Abstract. Interactive programs allow users to engage in input and output throughout
execution. The ubiquity of such programs motivates the development of models for
reasoning about their information-flow security, yet no such models seem to exist for
imperative programming languages. Further, existing language-based security conditions
founded on noninteractive models permit insecure information flows in interactive
imperative programs. This paper formulates new strategy-based information-flow security
conditions for a simple imperative programming language that includes input and output
operators. The semantics of the language enables a fine-grained approach to the
resolution of nondeterministic choices. The security conditions leverage this approach
to prohibit refinement attacks while still permitting observable nondeterminism.
Extending the language with probabilistic choice yields a corresponding definition of
probabilistic noninterference. A soundness theorem demonstrates the feasibility of
statically enforcing the security conditions via a simple type system. These results
constitute a step toward understanding and enforcing information-flow security in
real-world programming languages, which include similar input and output operators.


1  Introduction 

Secure programs should maintain the secrecy of confidential information. For sequential
imperative programming languages, this principle has led to a variety of
information-flow security conditions which assume that all confidential information is
supplied as the initial values of a set of program variables. This assumption reflects
an idealized batch-job model of input and output, whereby all inputs are obtained (as
initial values of program variables) from users before the program begins execution, and
all outputs are provided (as final values of program variables) after program
termination. Accordingly, these security conditions aim to protect the secrecy only of
initial values.

Many real-world programs are interactive, sending output to and receiving input from
their external environment throughout execution. Examples of such programs include web
servers, GUI applications, and some command-line applications. The batch-job model is
unable to capture the behavior of interactive programs because of dependencies between
inputs and outputs. For example, a program implementing a challenge/response protocol
must first output a challenge to the user and then accept the user's response as input;
clearly, the user cannot supply the response as the initial value of a program variable.
In contrast, the interactive model generalizes the batch-job model: any batch-job
program can be simulated by an interactive program that reads the initial values of all
relevant variables, executes the corresponding batch-job program, and finally outputs
the values of all variables.

Given the prevalence of interactive programs, it is important to be able to reason about
their security properties. Traditionally, researchers have reasoned about information
flow in interactive systems by encoding them as state machines (e.g., Mantel [19] and
McLean [22, 23]) or as concurrent processes (e.g., Focardi and Gorrieri [6]) and
applying trace-based information-flow security conditions. But since implementors
usually create imperative programs, not abstract models, a need exists for tools that
enable direct reasoning about the security of such programs. This paper addresses that
need by developing a model for reasoning about the information-flow security of
interactive imperative programs. Our model achieves a clean separation of user behavior
from program code by employing user strategies, which describe how agents interact with
their environment. Strategies are closely related to processes described in a language
like CCS [24] or CSP [17]. We give novel strategy-based semantic security conditions
similar to Wittbold and Johnson's definition of nondeducibility on strategies [38],
which ensure that confidential information cannot flow from high-confidentiality users
to low-confidentiality users. We also leverage previous work on static analysis
techniques by adapting the type system of Volpano, Smith, and Irvine [37] to an
interactive setting.

Our language and security conditions synthesize two branches of information-flow
security research, in that we leverage the trace-based definitions that have been
proposed for interactive systems to provide novel security conditions for imperative
programs. Furthermore, our interactive programming language can be viewed as a
specification language for interactive systems that more closely approximates the
implementation of real programs than the abstract system models that have previously
been used.

Nondeterminism arises in real-world systems for a number of reasons, including
concurrency and probabilistic randomization. It is therefore an important consideration
when reasoning about imperative programs. Nondeterminism is orthogonal to interactivity,
but the interplay between information flow and nondeterminism is often quite subtle. We
examine two kinds of nondeterministic choices: those which we assume are made
probabilistically, and those to which we are unable or unwilling to assign
probabilities. We refer to the former as probabilistic choice, and to the latter as
nondeterministic choice. Following Halpern and Tuttle [15], we factor out
nondeterministic choice so that we can reason about it in isolation from probabilistic
choice. By explicitly representing the resolution of nondeterministic choice in the
language semantics, we adapt our security condition to rule out refinement attacks in
which the resolution of nondeterministic choice results in insecure information flows.
Finally, we give a security condition, based on Gray and Syverson's definition of
probabilistic noninterference [11], that rules out probabilistic information flows in
randomized interactive programs.

In Section 2 we develop our system model and introduce mathematical structures for
reasoning about the behavior and observations of users. We proceed to instantiate the
model on a simple language of while-programs in Section 3 and to give an operational
semantics and security condition for the language. We then incorporate language features
for nondeterministic choice (Section 4) and probabilistic choice (Section 5) and adapt
our security conditions accordingly. In Section 6 we demonstrate the feasibility of
statically enforcing our security condition by presenting a sound type system. Section 7
discusses related work, and Section 8 concludes.

2  User  Strategies 

It might seem at first that information-flow security for interactive programs can be
obtained by adopting the same approach used for batch-job programs, that is, by
preventing low-confidentiality users from learning anything about high-confidentiality
inputs. (Hereafter we use the more concise terms "high" and "low" when describing the
confidentiality level associated with inputs, users, and so on.) However, several
papers, starting with Wittbold and Johnson [38], have described systems in which high
users can transmit information to low users even though low users learn nothing about
the high inputs. This is demonstrated by Program P1 below, an insecure one-time pad
implementation described by Wittbold and Johnson. Command input x from C reads a value
from a channel named C and stores it in variable x; similarly, output e to C outputs the
value of expression e on a channel named C. Assume that low users may use only channel
L, that high users may use channel H, and that no users may observe the values of
program variables. Infix operator [] nondeterministically chooses to execute one of its
two operands.


P1 :  while (true) do
        x := 0 [] x := 1;
        output x to H;
        input y from H;
        output x xor (y mod 2) to L

If nondeterminism is resolved in a way that is unpredictable to the low user, he will be
unable to determine the inputs on channel H: for any output on L, the input on H could
have been either 0 or 1. Yet the high user can still communicate an arbitrary
confidential bit z to channel L at each iteration of the loop by choosing z xor x as
input on H.
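The leak can be checked with a short simulation. The sketch below is illustrative only: the function name `run_p1`, the use of a seeded pseudorandom generator to stand in for the nondeterministic choice, and the restriction to a finite list of secret bits are all assumptions made for the example, not part of the model.

```python
import random

def run_p1(secret_bits, seed=0):
    """Simulate one loop iteration of Program P1 per secret bit.

    Returns the list of values output on channel L.
    """
    rng = random.Random(seed)          # stands in for the choice operator []
    low_outputs = []
    for z in secret_bits:
        x = rng.choice([0, 1])         # x := 0 [] x := 1
        # output x to H: the high user observes x ...
        y = z ^ x                      # ... and supplies z xor x (input y from H)
        low_outputs.append(x ^ (y % 2))  # output x xor (y mod 2) to L
    return low_outputs

print(run_p1([1, 0, 1, 1, 0]))
```

Because x xor (z xor x) equals z for bits, the values on L reproduce the secret bits regardless of how the choice is resolved.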

The confidential information z is never directly acquired by the program: it is neither
the initial value of a program variable nor an input supplied on a channel. As Wittbold
and Johnson observe, maintaining the secrecy of all high inputs (and even the initial
values of program variables) is therefore insufficient to preserve the secrecy of
confidential information.

In Program P1, the high user is able to communicate arbitrary confidential information
by selecting his next input as a function of outputs he has previously received. This
suggests that if we want to prevent confidential information from flowing to low users,
we should protect the secrecy of the function that high users employ to select inputs.
Following Wittbold and Johnson's terminology, we call this function a user strategy. In
the remainder of this section we develop the mathematical structures needed to define
user strategies formally.

2.1  Types,  Users,  and  Channels 

We assume a set $\mathcal{L}$ of security types with ordering relation $\leq$ and use
metavariable $\tau$ to range over security types. For simplicity, we assume that
$\mathcal{L}$ equals $\{L, H\}$ with $L \leq H$. (Our results generalize to partial
orders of security types.) Security type $L$ represents low confidentiality, and $H$
represents high confidentiality. The ordering $\leq$ indicates the relative
restrictiveness of security types: high-confidentiality information is more restricted
in its use than low-confidentiality information.

Users are agents (including humans and programs) that interact with executing programs.
We associate with each user a security type indicating the highest level of confidential
information that the user is permitted to read. Conservatively, we assume that users of
the same security type may collaborate while attempting to subvert the security of a
program. We can thus simplify our security analyses by reasoning about exactly two
users, one representing the pooled knowledge of low users and another representing the
pooled knowledge of high users.

We also assume the existence of channels with blocking input and nonblocking output.
Although input is blocking, we assume that all inputs prompted for are eventually
supplied. Each channel is associated with a security type $\tau$, and only users of that
type are permitted to use the channel. For simplicity, we assume that there are exactly
two channels, $L$ and $H$. We also assume that the values that are input and output on
channels are integers. These are not fundamental restrictions; our results could be
extended to allow multiple channels of each type, to allow high users to observe low
channels, and to allow more general data types.

2.2  Traces 

An event is the transmission of an input or output on a channel. Denote the input of
value $v$ on the channel of type $\tau$ as $in(\tau, v)$ and the output of $v$ on $\tau$
as $out(\tau, v)$. Let $Ev(\tau)$ be the set of all events that could occur on channel
$\tau$:

$$Ev(\tau) = \bigcup_{v \in \mathbb{Z}} \{ in(\tau, v),\ out(\tau, v) \}.$$

Let $Ev$ be the set of all events:

$$Ev = \bigcup_{\tau \in \mathcal{L}} Ev(\tau).$$

We use metavariable $\alpha$ to range over events in $Ev$.

A trace is a finite list of events. Given $E \subseteq Ev$, an event trace on $E$ is a
finite, possibly empty list $\langle \alpha_1, \ldots, \alpha_n \rangle$ such that
$\alpha_i \in E$ for all $i$. The empty trace is written $\langle\rangle$. The set of all
traces on $E$ is denoted $Tr(E)$, and we abbreviate the set of all traces $Tr(Ev)$ as
$Tr$. Trace equality is defined pointwise, and the concatenation of two traces $t$ and
$t'$ is denoted $t\,\hat{}\,t'$. A trace $t'$ extends trace $t$ if there exists a trace
$t''$ such that $t' = t\,\hat{}\,t''$. The restriction of $t$ to $E$, denoted
$t \upharpoonright E$, is the trace that results from removing all events not contained
in $E$ from $t$. We write $t \upharpoonright \tau$ as shorthand for
$t \upharpoonright Ev(\tau)$. A low trace is the low restriction $t \upharpoonright L$
of a trace $t$.
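These definitions translate directly into a small data model. The following is a minimal sketch, assuming events are encoded as named tuples and traces as Python tuples; the names `Event`, `restrict`, and `extends` are chosen for illustration and do not come from the paper.

```python
from typing import NamedTuple, Tuple

class Event(NamedTuple):
    kind: str      # "in" or "out"
    channel: str   # security type of the channel: "L" or "H"
    value: int

Trace = Tuple[Event, ...]

def restrict(trace: Trace, channel: str) -> Trace:
    """Restriction of a trace to Ev(channel): drop events on other channels."""
    return tuple(ev for ev in trace if ev.channel == channel)

def extends(t_prime: Trace, t: Trace) -> bool:
    """True when t_prime equals t concatenated with some (possibly empty) trace."""
    return t_prime[:len(t)] == t

t = (Event("out", "H", 1), Event("in", "H", 0), Event("out", "L", 1))
low_trace = restrict(t, "L")   # the low trace of t
```

Tuple concatenation with `+` plays the role of trace concatenation, and the empty tuple `()` plays the role of the empty trace.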

2.3  User  Strategies 

As demonstrated by Program P1, the input supplied by a user may depend on past events
observed by that user. To capture this dependence we employ a user strategy, which
determines the input for a particular channel as a function of the events that occur on
the channel. Because events on a channel include both inputs and outputs, this function
depends on both the user's observations and previous actions. Formally, a user strategy
for a channel with security type $\tau$ is a function of type
$Tr(Ev(\tau)) \to \mathbb{Z}$. Let UserStrat be the set of all user strategies. (Note
that, to simulate the batch-job model, the initial inputs provided by users can be
represented by a constant strategy that selects inputs without regard for past inputs or
outputs. Also, high user strategies can be extended to depend on observation of the low
channel, as described at the end of Section 3.)

As an example, we present a strategy that a high user could employ to transmit an
arbitrary stream of bits $z_1 z_2 \cdots$ to the low user in Program P1. This user
strategy, $g$, ensures that if $b$ was the previous output on $H$, then the next input on
$H$ is the bitwise exclusive-or of $b$ and $z_i$. Note that every second event on channel
$H$ is an input event $in(H, v)$.

$$g(\langle \alpha_1, \ldots, \alpha_n \rangle) =
\begin{cases}
z_i \ \mathrm{xor}\ b & \text{if } \alpha_n = out(H, b) \text{ and } n = 2i - 1 \\
0 & \text{otherwise}
\end{cases}$$

A joint strategy is a collection of user strategies, one for each channel. Formally, a
joint strategy $\omega$ is a function of type $\mathcal{L} \to UserStrat$, that is, a
function from security types to user strategies. Let Strat be the set of all joint
strategies.
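As a concrete illustration, user strategies can be written as functions from a channel's trace to an integer, and a joint strategy as a mapping from security types to such functions. The sketch below makes several assumptions for the sake of the example: events are plain `(kind, channel, value)` tuples, the bit stream is a finite list (the strategy answers 0 once it is exhausted), and the names `make_g` and `constant` are hypothetical.

```python
def make_g(bits):
    """Build a high strategy in the style of g for the secret bits z_1, z_2, ..."""
    def g(trace):
        n = len(trace)
        # g applies when the last event is an output out(H, b) and n = 2i - 1.
        if n % 2 == 1 and trace[-1][0] == "out":
            i = (n + 1) // 2
            if i <= len(bits):
                b = trace[-1][2]
                return bits[i - 1] ^ b   # z_i xor b
        return 0                          # otherwise
    return g

def constant(v):
    """A constant strategy, as used to simulate the batch-job model."""
    return lambda trace: v

# A joint strategy: one user strategy per security type.
joint = {"L": constant(0), "H": make_g([1, 1])}

print(joint["H"]((("out", "H", 0),)))   # z_1 xor 0
```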

3  Noninterference  for  Interactive  Programs 

While-programs, extended with commands for input and output, constitute our core
interactive programming language. The syntax of this language is:

$$\begin{array}{lrcl}
\text{(expressions)} & e & ::= & n \mid x \mid e_0 \oplus e_1 \\
\text{(commands)} & c & ::= & \mathbf{skip} \mid x := e \mid c_0; c_1 \mid \\
 & & & \mathbf{input}\ x\ \mathbf{from}\ \tau \mid \mathbf{output}\ e\ \mathbf{to}\ \tau \mid \\
 & & & \mathbf{if}\ e\ \mathbf{then}\ c_0\ \mathbf{else}\ c_1 \mid \mathbf{while}\ e\ \mathbf{do}\ c
\end{array}$$

Metavariable $x$ ranges over Var, the set of all program variables. Variables take values
in $\mathbb{Z}$, the set of integers. Literal values $n$ also range over integers. Binary
operator $\oplus$ denotes any total binary operation on the integers.

3.1  Operational  Semantics 

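The execution of a program modifies the values of variables and produces events on
channels. A state determines the values of variables. Formally, a state is a function of
type $Var \to \mathbb{Z}$. Let $\sigma$ range over states. A configuration is a 4-tuple
$(c, \sigma, t, \omega)$ representing a system about to execute $c$ with state $\sigma$
and joint strategy $\omega$. Trace $t$ is the history of events produced by the system so
far. Let $m$ range over configurations. Terminal configurations, which have no commands
remaining to execute, have the form $(\mathbf{skip}, \sigma, t, \omega)$.

(Assign)
$$(x := e, \sigma, t, \omega) \longrightarrow (\mathbf{skip}, \sigma[x := \sigma(e)], t, \omega)$$

(Seq-1)
$$(\mathbf{skip}; c, \sigma, t, \omega) \longrightarrow (c, \sigma, t, \omega)$$

(Seq-2)
$$\frac{(c_0, \sigma, t, \omega) \longrightarrow (c_0', \sigma', t', \omega)}
       {(c_0; c_1, \sigma, t, \omega) \longrightarrow (c_0'; c_1, \sigma', t', \omega)}$$

(In)
$$\frac{\omega(\tau)(t \upharpoonright \tau) = v}
       {(\mathbf{input}\ x\ \mathbf{from}\ \tau, \sigma, t, \omega) \longrightarrow
        (\mathbf{skip}, \sigma[x := v], t\,\hat{}\,\langle in(\tau, v) \rangle, \omega)}$$

(Out)
$$\frac{\sigma(e) = v}
       {(\mathbf{output}\ e\ \mathbf{to}\ \tau, \sigma, t, \omega) \longrightarrow
        (\mathbf{skip}, \sigma, t\,\hat{}\,\langle out(\tau, v) \rangle, \omega)}$$

(If-1)
$$\frac{\sigma(e) \neq 0}
       {(\mathbf{if}\ e\ \mathbf{then}\ c_0\ \mathbf{else}\ c_1, \sigma, t, \omega) \longrightarrow (c_0, \sigma, t, \omega)}$$

(If-2)
$$\frac{\sigma(e) = 0}
       {(\mathbf{if}\ e\ \mathbf{then}\ c_0\ \mathbf{else}\ c_1, \sigma, t, \omega) \longrightarrow (c_1, \sigma, t, \omega)}$$

(While)
$$(\mathbf{while}\ e\ \mathbf{do}\ c, \sigma, t, \omega) \longrightarrow
  (\mathbf{if}\ e\ \mathbf{then}\ (c; \mathbf{while}\ e\ \mathbf{do}\ c)\ \mathbf{else}\ \mathbf{skip}, \sigma, t, \omega)$$

Fig. 1. Operational semantics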

The operational semantics for our language is a small-step relation $\longrightarrow$ on
configurations. Membership in the relation is denoted

$$(c, \sigma, t, \omega) \longrightarrow (c', \sigma', t', \omega),$$

meaning that execution of command $c$ can take a single step to command $c'$, while
updating the state from $\sigma$ to $\sigma'$. Trace $t'$ extends $t$ with any events that
were produced during the step. Note that joint strategy $\omega$ is unchanged when a
configuration takes a step; we include it in the configuration only to simplify notation
and presentation.

The inductive rules defining relation $\longrightarrow$ are given in Figure 1. The rules
for commands other than input and output are all standard. In rule Assign, $\sigma(e)$
denotes the value of expression $e$ in state $\sigma$, and state update $\sigma[x := v]$
changes the value of variable $x$ to $v$ in $\sigma$. Rule In uses the joint strategy
$\omega$ to determine the next input event and appends it to the current trace, and rule
Out simply appends the output event to the current trace.

Let $\longrightarrow^*$ be the reflexive transitive closure of $\longrightarrow$.
Intuitively, if

$$(c, \sigma, t, \omega) \longrightarrow^* (c', \sigma', t', \omega)$$

then configuration $(c, \sigma, t, \omega)$ can reach configuration
$(c', \sigma', t', \omega)$ in zero or more steps. Configuration $m$ emits $t$, denoted
$m \rightsquigarrow t$, when there exists a configuration $(c, \sigma, t, \omega)$ such
that $m \longrightarrow^* (c, \sigma, t, \omega)$. Note that emitted events may include
both inputs and outputs.
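The rules of Figure 1 can be read as a small interpreter. The sketch below is one possible rendering, not the paper's implementation: commands are encoded as tagged tuples, expressions as Python callables on the state (an assumption that sidesteps expression syntax), events as `(kind, channel, value)` tuples, and `run` bounds execution with a hypothetical `fuel` parameter so that diverging programs return.

```python
SKIP = ("skip",)

def restrict(trace, ch):
    """Restriction of a trace to the events on channel ch."""
    return tuple(ev for ev in trace if ev[1] == ch)

def step(c, s, t, w):
    """One small step (c, s, t, w) -> (c', s', t'); the joint strategy w never changes."""
    op = c[0]
    if op == "assign":                        # Assign
        _, x, e = c
        return SKIP, {**s, x: e(s)}, t
    if op == "seq":
        _, c0, c1 = c
        if c0 == SKIP:                        # Seq-1
            return c1, s, t
        c0p, sp, tp = step(c0, s, t, w)       # Seq-2
        return ("seq", c0p, c1), sp, tp
    if op == "input":                         # In: ask the strategy for the channel
        _, x, ch = c
        v = w[ch](restrict(t, ch))
        return SKIP, {**s, x: v}, t + (("in", ch, v),)
    if op == "output":                        # Out
        _, e, ch = c
        return SKIP, s, t + (("out", ch, e(s)),)
    if op == "if":                            # If-1 (nonzero guard) / If-2 (zero guard)
        _, e, c0, c1 = c
        return (c0 if e(s) != 0 else c1), s, t
    if op == "while":                         # While: unfold once
        _, e, body = c
        return ("if", e, ("seq", body, c), SKIP), s, t
    raise ValueError("terminal configurations cannot step")

def run(c, s, w, fuel=1000):
    """Iterate step from the empty trace until skip is reached or fuel runs out."""
    t = ()
    while c != SKIP and fuel > 0:
        c, s, t = step(c, s, t, w)
        fuel -= 1
    return c, s, t

# input x from H; output x + 1 to L
prog = ("seq", ("input", "x", "H"), ("output", lambda s: s["x"] + 1, "L"))
w = {"H": lambda tr: 5, "L": lambda tr: 0}
final_c, final_s, final_t = run(prog, {}, w)
```

Every intermediate trace visited by `run` is a trace emitted by the initial configuration in the sense defined above.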

3.2  A  Strategy-Based  Security  Condition 

We now develop a security condition which ensures that users with access only to channel
$L$ do not learn anything about the strategies employed by users interacting with channel
$H$. Since strategies encode the possible actions that users may take as they interact
with the system, protecting the secrecy of high strategies ensures that the actions
taken by high users cannot affect (or "interfere with") the observations of low users.
The security condition can be seen as an instance of nondeducibility on strategies as
defined by Wittbold and Johnson [38] or as an instance of definitions of secrecy given by
Halpern and O'Neill [13, 14].

Informally, a program is secure if, for every initial state $\sigma$, any trace of events
seen on channel $L$ is consistent with every possible user strategy for channel $H$. This
ensures that low users cannot learn any information, including inputs, that high users
attempt to convey, even if low users know the program text.

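Definition 1 (Noninterference). A command $c$ satisfies noninterference exactly when:

    For all $m = (c, \sigma, \langle\rangle, \omega)$ and $m' = (c, \sigma, \langle\rangle, \omega')$
    such that $\omega(L) = \omega'(L)$,
    and for all $t$ such that $m \rightsquigarrow t$,
    there exists a $t'$ such that $t \upharpoonright L = t' \upharpoonright L$ and $m' \rightsquigarrow t'$.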

According to this condition, the high strategy $\omega(H)$ in $m$ can be replaced by any
other high strategy without affecting the low traces emitted. Although the condition
assumes that programs begin with an empty trace of prior events, it can be generalized
to account for arbitrary traces. (See Appendix A.) Some additional implications of this
security condition are discussed below.
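The condition can be falsified by testing: if two joint strategies that agree on $L$ yield different low traces, the program is insecure. The sketch below assumes deterministic, terminating programs, for which comparing the low restrictions of the final traces suffices, since the emitted traces are exactly the prefixes of the final trace. Programs are embedded directly as Python functions from a joint strategy to a final trace, and all names (`leaky`, `harmless`, `find_violation`) are hypothetical. A `None` result means only that no counterexample was found among the strategies tried, not that the program is secure.

```python
def restrict(trace, channel):
    """Restriction of a trace of (kind, channel, value) events to one channel."""
    return tuple(ev for ev in trace if ev[1] == channel)

def leaky(w):
    """input x from H; output x to L"""
    t = ()
    x = w["H"](restrict(t, "H"))
    t += (("in", "H", x),)
    t += (("out", "L", x),)
    return t

def harmless(w):
    """input x from H; output 0 to L"""
    t = ()
    x = w["H"](restrict(t, "H"))
    t += (("in", "H", x),)
    t += (("out", "L", 0),)
    return t

def find_violation(program, low_strategy, high_strategies):
    """Return names of two high strategies with different low traces, or None."""
    seen = {}
    for name, h in high_strategies.items():
        low_trace = restrict(program({"L": low_strategy, "H": h}), "L")
        for other, other_trace in seen.items():
            if other_trace != low_trace:
                return (other, name)
        seen[name] = low_trace
    return None

high = {"zero": lambda t: 0, "one": lambda t: 1}
print(find_violation(leaky, lambda t: 0, high))
print(find_violation(harmless, lambda t: 0, high))
```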

Initial variable values. The security condition does not protect the secrecy of the
initial values of variables. More concretely, the program output x to L is considered
secure for any $x \in Var$, whereas the program input x from H; output x to L is
obviously considered insecure. The definition thus reflects our intuition that high
users interact with the system only via input and output events on the high channel and
have no control over the initialization of variables. Systems in which the high user
controls the initial values of some or all variables can be modeled by prepending
commands that read inputs from the high user and assign them to variables.

Variable typings. It is not necessary to assign security types to program variables in
order to determine whether a program is secure. (A program with no high inputs, for
example, is secure regardless of its variables or their types.) Accordingly, our
security condition makes no reference to the security types of variables. This
distinguishes our work from most batch-job conditions, where variable typings are
fundamental. We do, however, employ variable typings for the static analysis technique
presented in Section 6.

Timing  sensitivity.  Our  observational  model  is  asynchronous:  users  do  not  observe 
the  time  when  events  occur  or  the  time  that  passes  while  a  program  is  blocking  on  an 
input  command.  The  security  condition  is  thus  timing-insensitive.  We  could  incorporate 
timing  sensitivity  into  the  model  by  assuming  that  users  observe  a  “tick”  event  at  each 
execution  step  or  by  tagging  events  with  the  time  at  which  they  occur;  strategies  could 
then  make  use  of  this  additional  temporal  information. 

Termination sensitivity. We make the standard assumption that users are unable to
observe the nontermination of a program. Nonetheless, our security condition is
termination-sensitive when low events follow commands that may not terminate. Consider
the following program:

P2 :  input x from H;
      if (x = 0) then {while (true) do skip} else skip;
      output 1 to L

A high user can cause this program to transmit the value 1 to a low user. Since this
would allow the low user to infer something about the high strategy, this program is
insecure according to our security condition.

We do not assume that users are able to observe the termination of a program directly,
but it would be easy to make termination observable by adding a distinguished
termination event that is broadcast on all channels when execution reaches a terminal
configuration.

Observation of channels. We have assumed that high users cannot observe the low channel,
but this restriction can be removed in several ways. For example, it is straightforward
to amend the operational semantics to echo low events to high channels by adding a high
output event (prepended with a label to distinguish it from regular high output events)
to the trace every time a low input or output event occurs.


4  Nondeterministic  Programs 

We distinguish two kinds of nondeterminism that appear in programs: probabilistic choice
and nondeterministic choice. Intuitively, probabilistic choice represents explicit use
of randomization, whereas nondeterministic choice represents program behavior that is
underspecified (perhaps due to unpredictable factors such as the scheduler in a
concurrent setting). Following the approach of previous work [15, 35], we factor out the
latter kind of nondeterminism by assuming that all nondeterministic choices are made as
if they were specified before the program began execution. (The implications of this
approach are discussed at the end of the section.) This allows reasoning about
nondeterministic choice and probabilistic choice in isolation, and our definitions of
noninterference reflect the resulting separation of concerns. In this section we extend
our model to include nondeterministic choice. We return to probabilistic choice in
Section 5.

4.1  Refiners 

We extend the language of Section 3 with nondeterministic choice:

$$c ::= \ldots \mid c_0 \mathbin{[]_\tau} c_1$$

Each nondeterministic choice is annotated with a security type $\tau$ that is used in the
operational semantics. The need for the annotation is explained below; we remark,
however, that the type system described in Section 6 could be used to infer annotations
automatically, so that programmers need not specify them.

To factor out the resolution of nondeterminism, we introduce infinite lists of binary
values called refinement lists. Denote the set of all such refinement lists as RefList.
Informally, when a nondeterministic choice is encountered during execution, the head
element of a refinement list is removed and used to resolve the choice. The program
executes the left command of the nondeterministic choice if the element is 0 and the
right command if the element is 1. Refinement lists are an operational analog of Milner's
oracle domains [25] for denotational semantics.

Nondeterministic  choices  should  not  cause  insecure  information  flows,  even  if  low 
users  can  predict  how  the  choices  will  be  made.  While  it  might  seem  that  using  a  single 
refinement  list  would  suffice  to  ensure  that  no  insecure  information  flows  arise  as  a 
result  of  the  resolution  of  nondeterministic  choice,  the  following  program  demonstrates 
that  this  is  not  the  case: 


      input x from H;
      if (x = 0) then {skip []_H skip} else skip;
      output 0 to L []_L output 1 to L

If the refinement list $\langle 1, 0, \ldots \rangle$ is used to execute this program, the
output on channel $L$ will equal the input on channel $H$. An insecure information flow
arises because the same refinement list is used to make both low and high choices. To
eliminate this flow, we identify the security type of a choice based on its annotation
and require that different lists be used to resolve choices at each type. This ensures
that the number of choices made at a given security level cannot become a covert
channel. (Note that this requirement lends itself to natural implementation techniques.
For example, if choices are made by using a stream of pseudorandom numbers, then
different streams should be used to resolve high and low choices. Or if [] represents
scheduler choices, then the scheduler should resolve choices at each security type
independently.)
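The covert channel can be reproduced in a few lines. The sketch below hand-translates the three-line program above into Python under two regimes: one shared refinement list for all choices, and one list per security type. The function names and the use of finite Python lists as stand-ins for infinite refinement lists are assumptions made for illustration; the high input is restricted to 0 or 1.

```python
def run_shared(x, bits):
    """Resolve every choice from one shared refinement list; return the low output."""
    bits = iter(bits)
    if x == 0:
        next(bits)       # skip []_H skip consumes one element
    return next(bits)    # output 0 to L []_L output 1 to L

def run_split(x, refiner):
    """Resolve high and low choices from separate lists; return the low output."""
    high, low = iter(refiner["H"]), iter(refiner["L"])
    if x == 0:
        next(high)       # the high choice consumes only from the high list
    return next(low)     # the low choice sees the same head regardless of x

shared = [1, 0]
print(run_shared(0, shared), run_shared(1, shared))   # low output tracks x

split = {"H": [1, 0], "L": [1, 0]}
print(run_split(0, split), run_split(1, split))       # low output independent of x
```

With the shared list, the number of high choices shifts which element resolves the low choice; with separate lists that dependence disappears.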

A refiner is a function $\psi : \mathcal{L} \to RefList$ that associates a refinement list
with each security type. Let Ref denote the set of all refiners. Denote the standard
list operations of reading the first element of a list and removing the first element of
a list as head and tail, respectively. Given a refiner $\psi$, the value
$head(\psi(\tau))$ is used to resolve the next choice annotated with type $\tau$.

(Seq-2)
$$\frac{(c_0, \sigma, \psi, t, \omega) \longrightarrow (c_0', \sigma', \psi', t', \omega)}
       {(c_0; c_1, \sigma, \psi, t, \omega) \longrightarrow (c_0'; c_1, \sigma', \psi', t', \omega)}$$

(Choice)
$$\frac{head(\psi(\tau)) = i}
       {(c_0 \mathbin{[]_\tau} c_1, \sigma, \psi, t, \omega) \longrightarrow
        (c_i, \sigma, \psi[\tau := tail(\psi(\tau))], t, \omega)}$$

Fig. 2. Operational semantics for nondeterministic choice


4.2  Operational  Semantics 

Using refiners, we extend the operational semantics of Section 3 to account for
nondeterministic choice. A command $c$ is now executed with respect to a refiner $\psi$,
in addition to a state $\sigma$, trace $t$, and joint strategy $\omega$. We thus modify
configurations to be 5-tuples $(c, \sigma, \psi, t, \omega)$; terminal configurations now
have the form $(\mathbf{skip}, \sigma, \psi, t, \omega)$.

All of the operational rules from Figure 1 are adapted in the obvious way to handle the
new configurations. The only interesting change is Seq-2, which is restated in Figure 2.
Nondeterministic choice is evaluated by the new rule Choice, which uses refiner $\psi$ to
resolve the choice and specifies how the refiner changes as a result. Refiner
$\psi[\tau := tail(\psi(\tau))]$ is the refiner $\psi$ with the refinement list for $\tau$
replaced by $tail(\psi(\tau))$.

Note  that  a  refiner  factors  out  all  nondeterminism  in  the  program:  once  a  refiner, 
state,  and  joint  strategy  have  been  fixed,  execution  is  completely  determined. 
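The refiner update performed by rule Choice can be sketched as a pure function. In this illustration a refiner is a dictionary from security types to tuples of bits; finite tuples stand in for the infinite refinement lists of the model, and the name `choice_step` is hypothetical.

```python
def choice_step(c0, c1, tau, psi):
    """Resolve a choice annotated with tau: pick c0 on 0, c1 on 1.

    Returns the chosen command and the refiner with the list for tau
    replaced by its tail; the lists for other types are untouched.
    """
    head, tail = psi[tau][0], psi[tau][1:]
    chosen = c0 if head == 0 else c1
    return chosen, {**psi, tau: tail}

psi = {"L": (1, 0), "H": (0, 0)}
chosen, psi_next = choice_step("output 0 to L", "output 1 to L", "L", psi)
print(chosen, psi_next)
```

Because the function returns a new dictionary rather than mutating `psi`, the same initial refiner can be reused to replay an execution, which mirrors the observation that execution is completely determined once the refiner, state, and joint strategy are fixed.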


4.3  A  Security  Condition  for  Nondeterministic  Programs 

A  well-known  problem  arises  with  nondeterministic  programs:  they  are  vulnerable  to 
refinement  attacks,  in  which  a  seemingly  secure  program  can  be  refined  to  an  insecure 
program.  For  example,  whether  the  input  from  H  is  kept  secret  in  the  following  program 
depends  on  how  the  nondeterministic  choice  is  resolved: 

P3 :  input x from H;
      output 0 to L [] output 1 to L

If  the  choice  is  made  independently  of  the  current  state  of  the  program,  say  by  tossing 
a  coin,  the  program  is  secure.  But  if  the  choice  is  made  as  a  function  of  x,  the  program 
may  leak  information  about  the  high  input. 

To  ensure  that  a  program  is  resistant  to  refinement  attacks,  we  insist  that,  for  all 
possible  resolutions  of  nondeterminism,  the  program  does  not  leak  any  confidential 
information.  Our  model  allows  this  quantification  to  be  expressed  cleanly,  since  refiners 
encapsulate  the  resolution  of  nondeterministic  choice.  We  adapt  the  security  condition 
of  Section  3.2  to  ensure  that,  for  any  refinement  of  the  program,  users  with  access  only 
to channel L do not learn anything about the strategies employed by users of channel H.

Definition 2 (Noninterference Under Refinement). A command $c$ satisfies
noninterference under refinement exactly when:

For all $m = (c, \sigma, \psi, \langle\rangle, \omega)$ and $m' = (c, \sigma, \psi, \langle\rangle, \omega')$
such that $\omega(L) = \omega'(L)$,
and for all $t$ such that $m \rightsquigarrow t$,
there exists a $t'$ such that $t \upharpoonright L = t' \upharpoonright L$ and $m' \rightsquigarrow t'$.

Some  implications  of  this  definition  are  discussed  below. 

Low-observable  nondeterminism.  This  security  condition  rules  out  refinement  attacks 
but allows programs that appear nondeterministic to a low user. For example, Program
P3 (with [] replaced by $[]_L$) satisfies noninterference under refinement, yet repeated
executions may reveal different program behavior to the low user.
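
A brute-force check of this claim over a small finite universe might look like the sketch below. Everything in it is an assumption for illustration: high strategies are reduced to a single constant input, refinement lists for L are one-element Python lists, events are (kind, channel, value) triples, and the emitted traces of P3 are taken to be the prefixes of its single execution.

```python
from itertools import product

def traces_p3(x_high, psi_L):
    """All traces emitted by P3 (choice annotated L) for high input x_high."""
    full = [("in", "H", x_high), ("out", "L", psi_L[0])]
    return [tuple(full[:k]) for k in range(len(full) + 1)]   # includes the empty trace

def low(trace):
    """Restrict a trace to the events on channel L."""
    return tuple(e for e in trace if e[1] == "L")

def noninterference_under_refinement(high_inputs, refiners):
    # For every refiner and every pair of high inputs, the sets of low
    # restrictions of emitted traces must coincide.
    for psi_L, x, y in product(refiners, high_inputs, high_inputs):
        if {low(t) for t in traces_p3(x, psi_L)} != {low(t) for t in traces_p3(y, psi_L)}:
            return False
    return True

result = noninterference_under_refinement(range(3), [[0], [1]])
```

The two refiners produce different low outputs, so a low user may see different behavior across runs, yet for each fixed refiner the low view does not depend on the high input.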

Initial  refinement  lists.  The  security  condition  does  not  require  the  secrecy  of  the  initial 
refinement  list  for  H.  More  concretely,  the  program 

output 0 to L $[]_H$ output 1 to L

is considered secure even though it reveals information about the first value of $\psi(H)$.
The  definition  thus  reflects  our  intuition  that  high  users  interact  with  the  system  only 
via  input  and  output  events  on  the  high  channel,  which  gives  them  no  control  over 
refinement  lists.  The  definition  of  noninterference  under  refinement  could  be  adapted  to 
systems  where  high  users  may  exert  control  over  refinement  lists. 

Expressivity  of  refiners.  Our  model  can  represent  only  those  refinements  that  appear 
as  if  they  were  made  before  the  program  began  execution.  Refinements  that  may  depend 
upon  dynamic  factors,  such  as  the  values  of  variables  or  the  current  program  counter, 
cannot be represented. Our model therefore captures compile-time nondeterminism but
not  runtime  nondeterminism  [16].  We  leave  development  of  more  sophisticated  refiners 
as  future  work. 


5  Probabilistic  Programs 

Probabilistic  choice  can  be  seen  as  refinement  of  arbitrary  nondeterministic  choice. 
Now  that  we  have  shown  how  refiners  can  be  used  to  factor  out  the  nondeterministic 
choices  to  which  we  are  unable  or  unwilling  to  assign  probabilities,  we  can  model 
probabilistic  choice  explicitly. 

We  begin  by  extending  the  nondeterministic  language  of  Section  4  with  probabilistic 
choice: 

$c ::= \ldots \mid c_0 \mathbin{{}_p[]} c_1$

Informally, probabilistic choice $c_0 \mathbin{{}_p[]} c_1$ executes command $c_0$ with probability $p$ and
command $c_1$ with probability $1 - p$. The probability annotation $p$ must be a real number
such that 0 < p < 1. We assume that probabilistic choices are made independently of
one another.

(Prob-1)  $(c_0 \mathbin{{}_p[]} c_1, \sigma, \psi, t, \omega) \xrightarrow{p} (c_0, \sigma, \psi, t, \omega)$

(Prob-2)  $(c_0 \mathbin{{}_p[]} c_1, \sigma, \psi, t, \omega) \xrightarrow{1-p} (c_1, \sigma, \psi, t, \omega)$

Fig. 3. Operational semantics for probabilistic choice


5.1  Operational  Semantics 

To  incorporate  probability  in  the  operational  semantics  we  extend  the  small-step  relation 
$\rightarrow$ of previous sections to include a label for probability. We denote membership in
the new relation by

$m \xrightarrow{p} m'$,

meaning that configuration $m$ steps with probability $p$ to configuration $m'$. Configurations
remain unchanged from the nondeterministic language of Section 4. The new
operational rules defining this relation are given in Figure 3. To facilitate
backwards-compatibility with the operational rules of previous sections, we interpret $m \rightarrow m'$ as
shorthand for $m \xrightarrow{1} m'$. The operational rules previously given in Figures 1 and 2 thus
remain  unchanged. 

5.2  A  Probabilistic  Security  Condition 

It is well-known that probabilistic programs may be secure with respect to
nonprobabilistic definitions of noninterference but leak confidential information with high
probability. As an example, consider the following program:

P4 :  input x from H;
      if x mod 2 = 0 then
        output 0 to L 0.99[] output 1 to L
      else
        output 0 to L 0.01[] output 1 to L

If  we  regard  probabilistic  choice  p[]  as  identical  to  nondeterministic  choice  []  then 
this  program  satisfies  noninterference  under  refinement.  Yet  with  high  probability,  the 
program  leaks  the  parity  of  the  high  input  to  channel  L. 

Toward  preventing  such  probabilistic  information  flows,  observe  that  if  a  low  trace 
t  is  likely  to  be  emitted  with  one  high  user  strategy  and  unlikely  with  another,  then  the 
low  user  learns  something  about  the  high  strategy  by  observing  the  occurrence  of  t.  We 
thus  conclude  that  our  security  condition  should  require  that  the  probability  with  which 
low  traces  are  emitted  be  independent  of  the  strategy  employed  on  the  high  channel, 
that  is,  that  low-equivalent  configurations  should  produce  particular  low  traces  with  the 
same  probability.  This  intuition  is  consistent  with  security  conditions  given  by  Gray  and 
Syverson [11] and Halpern and O’Neill [14].

More formally, let $\mathcal{E}_m(t)$ represent the event that configuration $m$ emits low trace
$t$. Suppose that we had a probability measure $\mu_m$ on such events. Then our security condition
should require, for all configurations $m$ and $m'$ that are equivalent except for the choice
of high strategy, and all low traces $t$, that $\mu_m(\mathcal{E}_m(t)) = \mu_{m'}(\mathcal{E}_{m'}(t))$. The remainder
of this section is devoted to defining $\mu_m$ and $\mathcal{E}_m(t)$.

We  begin  with  two  additional  intuitions.  First,  since  probabilistic  choices  are  made 
independently, the probability of an execution sequence

$m_0 \xrightarrow{p_0} m_1 \xrightarrow{p_1} \cdots \xrightarrow{p_{n-1}} m_n$

is equal to the product of the probabilities $p_i$ of the individual steps. Second, a
configuration $m$ could emit the same trace $t$ along multiple sequences, so the probability that
$m$ emits $t$ should be the sum of the probabilities associated with those sequences.
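
These two intuitions amount to a product followed by a sum. The sketch below is only an illustration: the step probabilities are invented, the helper names are hypothetical, and it assumes the listed sequences are distinct and each ends where the trace first appears.

```python
from fractions import Fraction
from math import prod

def sequence_probability(step_probs):
    """Probability of one execution sequence: the product of its step labels."""
    return prod(step_probs, start=Fraction(1))

def trace_probability(sequences):
    """Probability of a trace: the sum over the sequences that emit it."""
    return sum((sequence_probability(s) for s in sequences), Fraction(0))

# Two hypothetical sequences that emit the same trace.
seqs = [
    [Fraction(1, 2), Fraction(1, 3)],
    [Fraction(1, 2), Fraction(2, 3), Fraction(1, 2)],
]
total = trace_probability(seqs)
```

Exact rationals are used instead of floats so that equalities between probabilities can be tested without rounding error.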

Based on these intuitions, we now construct probability measure $\mu_m$ by adapting a
standard approach for reasoning about probabilities on trees [12]. For any configuration
$m$, relation $\rightarrow$ gives rise to a rooted directed probability tree whose vertices are labeled
with configurations, edges are labeled with probabilities, and root is $m$. Denote the
probability tree for $m$ by $T_m$ and the set of vertices of $T_m$ by $V_m$. A path in the tree is a
sequence of vertices, starting with the root, where each successive pair of vertices is an
edge. Given a vertex $v$, let $tr(v)$ be the trace of events in the configuration with which $v$
is labeled. We say that $t$ appears at $v$ when $tr(v) = t$ but $tr(v') \neq t$ for all ancestors $v'$
of $v$. Let $ap(t)$ be the set of vertices where $t$ appears. In accordance with the intuitions
described above, let $\pi(v)$ be the product of the probabilities on the path to $v$.

A ray is an infinite path or a finite path whose terminal node has no descendants.
Rays therefore represent maximal execution sequences. Let $\mathcal{R}_m$ denote the set of rays
of $T_m$. Let $R_m(v)$ be the set of rays that go through vertex $v$:

$$R_m(v) = \{r \in \mathcal{R}_m \mid v \text{ is on } r\}.$$

Let $\mathcal{A}_m$ be the $\sigma$-algebra on $\mathcal{R}_m$ generated by sets of rays going through particular
vertices, that is, by the set $\{R_m(v) \mid v \in V_m\}$.¹ The following result yields a probability
measure on sets of rays. It is a consequence of elementary results in probability theory,
and we omit the proof.

Theorem 1. For any configuration $m$, there exists a unique probability measure $\mu_m$ on
$\mathcal{A}_m$ such that for all $v \in V_m$ we have $\mu_m(R_m(v)) = \pi(v)$.

Now that we have constructed $\mu_m$, we must show how to use it to obtain the
probability of a set of traces in terms of the probability of a corresponding set of rays. For
a set $T$ of traces, let $R_m(T)$ be the set of rays on which a trace in $T$ appears. Let
$em_m(T) = \{t \in T \mid m \rightsquigarrow t\}$ be the set of traces in $T$ emitted by $m$, and note that

$$R_m(T) = \bigcup_{t \in em_m(T)} \; \bigcup_{v \in ap(t)} R_m(v)$$

because a trace appears on a ray $r$ if and only if it appears at a vertex $v$ on $r$. The set
$R_m(T)$ is measurable with respect to $\mathcal{A}_m$ because both $em_m(T)$ and $V_m$ are countable
sets. Given a trace $t$, the set $\{R_m(v) \mid v \in ap(t)\}$ is a partition of the set of rays on
which $t$ appears. It follows that

$$\mu_m(R_m(\{t\})) = \mu_m\Big(\bigcup_{v \in ap(t)} R_m(v)\Big) = \sum_{v \in ap(t)} \mu_m(R_m(v)) = \sum_{v \in ap(t)} \pi(v),$$

that is, that the probability that $m$ emits $t$ is equal to the sum of the values $\pi(v)$ for
vertices $v$ where $t$ appears, as desired.

¹ A $\sigma$-algebra on a set $X$ is a nonempty collection of subsets of $X$ that contains $X$ and is closed
under complements and countable unions [2]. (The $\sigma$ has no connection to states, although
we also use $\sigma$ as a metavariable that ranges over states.) A $\sigma$-algebra generated by a set $C$ of
subsets of $X$ is defined as the intersection of all $\sigma$-algebras on $X$, including $2^X$, that contain
$C$.
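
The sum of path products over the vertices where a trace appears can be computed directly on a finite probability tree. The following sketch assumes a small hand-built tree; the `Vertex` class and `appearance_probability` function are hypothetical names, traces are tuples of strings, and only finite trees are handled.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import List, Tuple

@dataclass
class Vertex:
    trace: Tuple[str, ...]                                  # tr(v)
    children: List[Tuple[Fraction, "Vertex"]] = field(default_factory=list)

def appearance_probability(root: Vertex, t: Tuple[str, ...]) -> Fraction:
    """Sum pi(v) over the vertices v where t appears (first occurrence per path)."""
    total = Fraction(0)
    stack = [(root, Fraction(1))]                           # (vertex, pi(vertex))
    while stack:
        v, pi_v = stack.pop()
        if v.trace == t:
            total += pi_v        # t appears at v, so descendants are not counted again
            continue
        for p, child in v.children:
            stack.append((child, pi_v * p))
    return total

# A hypothetical tree: the root emits "a" with probability 1/2, or takes a
# silent step after which it emits "a" (1/3) or "b" (2/3).
inner = Vertex((), [(Fraction(1, 3), Vertex(("a",))), (Fraction(2, 3), Vertex(("b",)))])
root = Vertex((), [(Fraction(1, 2), Vertex(("a",))), (Fraction(1, 2), inner)])
```

Stopping the descent at the first vertex whose trace equals t mirrors the requirement that no ancestor already carries t.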

We can now define $\mathcal{E}_m(t)$. Given a security type $\tau$ and a trace $t$, let $[t]_\tau$ be the
equivalence class of traces that are equal to $t$ when restricted to $\tau$:

$$[t]_\tau = \{t' \in Tr \mid t' \upharpoonright \tau = t \upharpoonright \tau\}.$$

Finally, let $\mathcal{E}_m(t)$ be the set of rays on which there is some vertex $v$ such that
$tr(v) \upharpoonright L = t \upharpoonright L$:

$$\mathcal{E}_m(t) = R_m([t]_L).$$

The set $\mathcal{E}_m(t)$ is in $\mathcal{A}_m$. By Theorem 1, $\mu_m(\mathcal{E}_m(t))$ is equal to the sum of values $\pi(v)$
for vertices $v$ such that $tr(v) \upharpoonright L = t \upharpoonright L$ and $tr(v') \upharpoonright L \neq t \upharpoonright L$ for any ancestor $v'$ of $v$.
We are now ready to formalize our security condition.

Definition 3 (Probabilistic Noninterference). A command $c$ satisfies probabilistic
noninterference exactly when:

For all $m = (c, \sigma, \psi, \langle\rangle, \omega)$ and $m' = (c, \sigma, \psi, \langle\rangle, \omega')$
such that $\omega(L) = \omega'(L)$,
and for all $t \in Tr(Ev(L))$,
we have $\mu_m(\mathcal{E}_m(t)) = \mu_{m'}(\mathcal{E}_{m'}(t))$.

Returning to Program P4 at the start of this section, it is easy to check that the
probability of the low trace $\langle out(L, 0) \rangle$ is 0.99 when the high strategy is to input an
even number, and 0.01 when the high strategy is to input an odd number. Clearly, the
program does not satisfy probabilistic noninterference.
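
The check can be made concrete with a short sketch for the program at the start of this section. It assumes the high strategy is simply a fixed integer input, and the function name is hypothetical.

```python
from fractions import Fraction

def low_output_distribution(x: int):
    """Distribution of the single low output when the high user inputs x."""
    p_zero = Fraction(99, 100) if x % 2 == 0 else Fraction(1, 100)
    return {0: p_zero, 1: 1 - p_zero}

even = low_output_distribution(2)
odd = low_output_distribution(3)
same_distribution = even == odd
```

Since the two distributions differ, a low user who observes the output 0 has strong evidence that the high input was even.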

If we interpret the nondeterministic choice in Program P2 as 0.5[] (a fair coin toss),
the program does not satisfy probabilistic noninterference. However, if the output to H
is removed, the resulting program

while (true) do
  x := 0 0.5[] x := 1;
  input y from H;
  output x xor (y mod 2) to L

does  satisfy  noninterference.  The  probability  of  low  outputs  is  independent  of  the  high 
strategy,  which  can  no  longer  exploit  knowledge  of  the  value  of  one-time  pad  x. 
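
One iteration of this loop can be checked exhaustively. The sketch below is an illustration under the assumption that the high input y is chosen without knowledge of x, which is what removing the output to H is meant to guarantee; the function name is hypothetical.

```python
from fractions import Fraction

def pad_low_distribution(y: int):
    """One loop iteration: x is 0 or 1 with probability 1/2 each, low output is x xor (y mod 2)."""
    dist = {0: Fraction(0), 1: Fraction(0)}
    for x in (0, 1):
        dist[x ^ (y % 2)] += Fraction(1, 2)
    return dist

distributions = [pad_low_distribution(y) for y in range(6)]
```

Every high input yields the same uniform distribution on the low output, so observing the output tells the low user nothing about y.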

User strategies as defined thus far are deterministic. However, our approach to
reasoning about probability applies to randomized user strategies as well as to randomized
programs, so it would be straightforward to adapt our model to handle randomized
strategies.


6  A Sound Type System


The problem of characterizing programs that satisfy noninterference is, for many
definitions of noninterference, intractable. For definitions appearing in the previous sections,
there is a straightforward reduction from the halting problem to the noninterference
problem. It follows that no decision procedure for certifying the information-flow
security of programs can be both sound and complete with respect to our definitions of
noninterference.  The  goal  of  this  section  is  to  demonstrate  that  static  analysis  techniques 
can  be  used  to  identify  secure  programs. 

We use a type system based on that of Volpano, Smith, and Irvine [37]. It is interesting
to note that a type system designed to enforce batch-job noninterference conditions
also  enforces  our  interactive  conditions,  including  probabilistic  noninterference,  even 
though  the  type  system  is  oblivious  to  the  subtleties  of  probability,  interactivity,  and 
user  strategies.  We  believe  that  other  type  systems  for  information  flow  (e.g.,  [3, 18, 
32, 34])  can  also  be  easily  adapted  for  our  interactive  model,  and  thus  that  advances  in 
precision  and  expressiveness  can  be  applied  to  our  work. 

The type system consists of a set of axioms and inference rules for deriving typing
judgments of the form $\Gamma \vdash p : \kappa$, meaning that phrase $p$ has phrase type $\kappa$ under
variable typing $\Gamma$. A phrase is either an expression or a command. A phrase type is
either a security type $\tau$ or a command type $\tau\ cmd$, where $\tau \in \mathcal{L}$. A variable typing
is a function $\Gamma : Var \rightarrow \mathcal{L}$ mapping from variables to security types. Informally, a
command $c$ has type $\tau\ cmd$ when $\tau$ is a lower bound on the effects that $c$ may have, that
is, when the types (under $\Gamma$) of any variables that $c$ updates are bounded below by $\tau$,
and any input or output that $c$ performs is on channels whose security type is bounded
below by $\tau$.

Axioms  and  inference  rules  for  the  type  system  are  given  in  Figure  4.  There  are 
two  types  of  rules:  typing  rules  (prefixed  with  “T”)  and  subtyping  rules  (prefixed  with 
“ST”).  Typing  rules  can  be  used  to  infer  the  type  of  an  expression  or  command  directly. 
Subtyping  rules  allow  a  low-typed  expression  to  be  treated  as  a  high-typed  expression 
and a high-typed command to be treated as a low-typed command. (It is safe, for example,
to store a low-typed expression in a high variable, or to output data to a high user
in  the  body  of  a  loop  with  a  low-typed  guard.) 

Most of the rules in this type system are standard. Rules T-In and T-Out are both
similar to T-Assign: T-In ensures that values read from the $\tau$ channel are stored in
variables whose type is bounded below by $\tau$, whereas T-Out ensures that only $\tau$-typed
expressions are output on the $\tau$ channel. Rules T-Choice and T-Prob are similar to
T-Seq, except that T-Choice also checks that the typing is consistent with the syntactic
type annotation. Rule T-While forbids high-guarded loops, ensuring that loop
termination does not depend on the high user’s strategy. This prohibits insecure programs
such as P1 (in Section 3.2). We believe this rule could be relaxed using techniques
described by Boudol and Castellani [3] and Smith [34].
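
One way to read these rules algorithmically is sketched below. This is an illustrative checker written under assumptions, not the formal system itself: the tuple-based syntax tree, the encoding L = 0 and H = 1, and the names `expr_type` and `cmd_type` are all hypothetical, and subtyping is folded into the checks by computing the least type of an expression and the greatest τ for which a command has type τ cmd.

```python
L, H = 0, 1   # two-point lattice with L <= H

def expr_type(gamma, e):
    """Least security type of expression e under variable typing gamma."""
    kind = e[0]
    if kind == "lit":
        return L
    if kind == "var":
        return gamma[e[1]]
    if kind == "op":
        return max(expr_type(gamma, e[1]), expr_type(gamma, e[2]))
    raise ValueError(f"unknown expression {e!r}")

def cmd_type(gamma, c):
    """Greatest tau such that c has type tau cmd, or None if c is not typable."""
    kind = c[0]
    if kind == "skip":
        return H
    if kind == "assign":                      # ("assign", x, e)
        _, x, e = c
        return gamma[x] if expr_type(gamma, e) <= gamma[x] else None
    if kind == "input":                       # ("input", x, tau)
        _, x, tau = c
        return tau if tau <= gamma[x] else None
    if kind == "output":                      # ("output", e, tau)
        _, e, tau = c
        return tau if expr_type(gamma, e) <= tau else None
    if kind == "while":                       # ("while", e, body): low guard only
        _, e, body = c
        ok = expr_type(gamma, e) == L and cmd_type(gamma, body) is not None
        return L if ok else None
    if kind not in ("seq", "prob", "if", "choice"):
        raise ValueError(f"unknown command {c!r}")
    # The remaining forms carry their two subcommands as the last two fields.
    t0, t1 = cmd_type(gamma, c[-2]), cmd_type(gamma, c[-1])
    if t0 is None or t1 is None:
        return None
    bound = min(t0, t1)
    if kind == "if":                          # ("if", e, c0, c1)
        return bound if expr_type(gamma, c[1]) <= bound else None
    if kind == "choice":                      # ("choice", tau, c0, c1)
        return c[1] if c[1] <= bound else None
    return bound                              # ("seq", c0, c1) or ("prob", p, c0, c1)

gamma = {"x": H}
leak = ("seq", ("input", "x", H), ("output", ("var", "x"), L))
p3 = ("seq", ("input", "x", H),
      ("choice", L, ("output", ("lit", 0), L), ("output", ("lit", 1), L)))
high_loop = ("while", ("var", "x"), ("skip",))
```

Under this reading, the direct leak and the high-guarded loop are rejected, while P3 with its choice annotated L is accepted at type L cmd.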

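(T-Lit)  $\Gamma \vdash n : \tau$

(T-Var)  $\dfrac{\Gamma(x) = \tau}{\Gamma \vdash x : \tau}$

(T-Op)  $\dfrac{\Gamma \vdash e_0 : \tau \qquad \Gamma \vdash e_1 : \tau}{\Gamma \vdash e_0 \oplus e_1 : \tau}$

(T-Skip)  $\Gamma \vdash \mathsf{skip} : \tau\ cmd$

(T-Assign)  $\dfrac{\Gamma(x) = \tau \qquad \Gamma \vdash e : \tau}{\Gamma \vdash x := e : \tau\ cmd}$

(T-If)  $\dfrac{\Gamma \vdash e : \tau \qquad \Gamma \vdash c_0 : \tau\ cmd \qquad \Gamma \vdash c_1 : \tau\ cmd}{\Gamma \vdash \mathsf{if}\ e\ \mathsf{then}\ c_0\ \mathsf{else}\ c_1 : \tau\ cmd}$

(T-Seq)  $\dfrac{\Gamma \vdash c_0 : \tau\ cmd \qquad \Gamma \vdash c_1 : \tau\ cmd}{\Gamma \vdash c_0;\ c_1 : \tau\ cmd}$

(T-While)  $\dfrac{\Gamma \vdash e : L \qquad \Gamma \vdash c : L\ cmd}{\Gamma \vdash \mathsf{while}\ e\ \mathsf{do}\ c : L\ cmd}$

(T-Choice)  $\dfrac{\Gamma \vdash c_0 : \tau\ cmd \qquad \Gamma \vdash c_1 : \tau\ cmd}{\Gamma \vdash c_0 \mathbin{[]_\tau} c_1 : \tau\ cmd}$

(T-Prob)  $\dfrac{\Gamma \vdash c_0 : \tau\ cmd \qquad \Gamma \vdash c_1 : \tau\ cmd}{\Gamma \vdash c_0 \mathbin{{}_p[]} c_1 : \tau\ cmd}$

(T-In)  $\dfrac{\Gamma(x) = \tau' \qquad \tau \le \tau'}{\Gamma \vdash \mathsf{input}\ x\ \mathsf{from}\ \tau : \tau\ cmd}$

(T-Out)  $\dfrac{\Gamma \vdash e : \tau}{\Gamma \vdash \mathsf{output}\ e\ \mathsf{to}\ \tau : \tau\ cmd}$

(T-Subtype)  $\dfrac{\Gamma \vdash p : \kappa_0 \qquad \kappa_0 \le \kappa_1}{\Gamma \vdash p : \kappa_1}$

(ST-Base)  $L \le H$

(ST-Refl)  $\kappa \le \kappa$

(ST-Cmd)  $\dfrac{\tau_0 \le \tau_1}{\tau_1\ cmd \le \tau_0\ cmd}$

Fig. 4. Typing rules


The following theorem states that this type system soundly enforces noninterference.
Recall that our security conditions do not depend on the security types of variables.
Noninterference is enforced provided there exists some variable typing under
which the program is well-typed.² The proof is in Appendix A.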

Theorem 2 (Soundness). For any command $c$, if there exists a variable typing $\Gamma$ and a
security type $\tau$ such that $\Gamma \vdash c : \tau\ cmd$, then

(a) if $c$ does not contain nondeterministic or probabilistic choice, then $c$ satisfies
noninterference;

(b) if $c$ does not contain probabilistic choice, then $c$ satisfies noninterference under
refinement; and

(c) $c$ satisfies probabilistic noninterference.

7  Related  Work 

Definitions of information-flow security for imperative programs began with the work
of Denning [5]. Many subsequent papers define information-flow security for various
sequential imperative languages, but nearly all of these papers assume a batch-job
model of computation. Therefore, they attempt to ensure the secrecy of high-typed
program variables rather than of the behavior of high users who interact with the system.
See  Sabelfeld  and  Myers  [30]  for  a  survey  of  language-based  information-flow  security. 

Another line of work considers end-to-end information-flow restrictions for
nondeterministic systems that provide input and output functionality for users. Definitions
of noninterference exist both for abstract systems (such as finite state machines) that
include input and output operations (Goguen and Meseguer [10], McCullough [21],
McLean [23], Mantel [19]), and for systems described using process algebras such
as CCS, the π-calculus, and related formalisms (Focardi and Gorrieri [6], Ryan and
Schneider [28], Zdancewic and Myers [39]).

Definitions  of  noninterference  based  on  process  algebras  typically  require  that  the 
observations  made  by  a  public  user  are  the  same  regardless  of  which  high  processes  (if 
any)  are  interacting  with  the  system.  These  definitions  are  thus  similar  in  spirit  to  our 
definitions  of  noninterference.  Indeed,  there  is  a  close  connection  between  strategies 
and processes: both can be viewed as descriptions of how an agent will behave in an
interactive  setting.  A  formal  comparison  with  process-based  definitions  (such  as  [8]) 
may  uncover  further  connections  between  process-based  system  models  and  imperative 
programs. 

Wittbold and Johnson [38] give the first strategy-based definition of information-flow
security, and Gray and Syverson [11] give a strategy-based definition of probabilistic
noninterference. Halpern and O’Neill [14] generalize the definitions of Gray
and Syverson to account for richer system models and more general notions of
uncertainty. Our definitions of noninterference, which are instances of Halpern and O’Neill’s
definitions of secrecy, are the first strategy-based security conditions for an imperative
programming language of which we are aware. Our work can thus be viewed as a
unification of two distinct strands of the information-flow literature. In this sense our work
is similar to that of Mantel and Sabelfeld [20], who demonstrate a connection between
security predicates taken from the MAKS framework of Mantel [19] and bisimulation-
based definitions of security for a concurrent imperative language due to Sabelfeld and
Sands [31]. However, Mantel and Sabelfeld do not consider interactive programs.

² Because the security types of variables can be inferred, programmers need not specify them.
In a (trivially secure) program with no high inputs, for example, all variables can be assigned
type L.

Our probabilistic noninterference condition can be interpreted as precluding
programs that allow low users to make observations that improve the accuracy of their
beliefs about high behavior, that is, their beliefs about which high strategy is used. Halpern
and O’Neill [14] prove a result which implies that our probabilistic security condition
suffices to ensure that low users cannot improve the accuracy of their subjective beliefs
about high behavior by interacting with a program. Our probabilistic security condition
also ensures that the quantity of information flow due to a secure program is exactly
zero bits in the belief-based quantitative information-flow model of Clarkson, Myers,
and Schneider [4].

The  bisimulation-based  security  condition  of  Sabelfeld  and  Sands  [31]  can  be  viewed 
as  a  relaxation  of  the  batch-job  model.  However,  as  Mantel  and  Sabelfeld  [20]  point 
out,  bisimulation-based  definitions  are  difficult  to  relate  to  trace-based  conditions  when 
a  nondeterministic  choice  operator  is  present  in  the  language.  The  following  program, 
for example, satisfies both noninterference under refinement and probabilistic
noninterference (for suitable interpretations of the [] operator), but it is not secure with respect
to  a  bisimulation-based  definition  of  security: 

input x from H;
if (x = 0)
  output 0 to L;
  {output 1 to L [] output 2 to L}
else
  {output 0 to L; output 1 to L} []
  {output 0 to L; output 2 to L}

Bisimulation-based security conditions implicitly assume that users can observe
internal choices made by a program. When users observe only inputs and outputs on
channels, our observational model is more appropriate.

Interactivity  between  users  and  a  program  is  similar  to  message-passing  between 
threads.  Sabelfeld  and  Mantel  [29]  present  a  multi-threaded  imperative  language  with 
explicit  send,  blocking  receive,  and  non-blocking  receive  operators  for  communication 
between  processes.  They  describe  a  bisimulation-based  security  condition  and  a  type 
system to enforce it. However, it is not clear how to model user behavior in their
setting. Users cannot be modeled as processes since user behavior is unknown, and their
security  condition  applies  only  if  the  entire  program  is  known. 

Almeida Matos, Boudol, and Castellani [1] state a bisimulation-based security
condition for reactive programs, which allow limited communication between processes,
and  they  give  a  sound  type  system  to  enforce  the  condition.  In  their  language,  programs 
react  to  the  presence  and  absence  of  named  broadcast  signals  and  can  emit  signals  to 
other  programs  in  a  “local  area.”  It  is  possible  to  implement  our  higher-level  channels 
and  events  within  a  local  area,  using  their  lower-level  reactivity  operators.  However,  it 
is unclear how to use reactivity to model interactions with unknown users who are not
part of a local area.

Focardi and Rossi [7] study the security of processes in dynamic contexts where the
environment, including high processes, can change throughout execution. This is
similar to how high user strategies describe changing inputs throughout execution. However,
user  strategies  depend  upon  the  history  of  the  computation,  whereas  dynamic  contexts 
do  not,  so  it  is  unclear  how  to  encode  a  user  strategy  using  dynamic  contexts. 

Previous  work  dealing  with  the  susceptibility  of  possibilistic  noninterference  to 
refinement  attacks  takes  one  of  two  approaches  to  specifying  how  nondeterministic 
choice  is  resolved.  One  approach  is  to  assume  that  choices  are  made  according  to  fixed 
probability distributions, as we do in Section 5. Volpano and Smith [36], for example,
describe a scheduler for a multithreaded language that chooses threads to execute
according to a uniform probability distribution. A second approach is to insist that
programs be observationally deterministic for low users. McLean [22] and Roscoe [27]
both advocate observational determinism as an appropriate security condition for
nondeterministic systems, and Zdancewic and Myers [39] give a security condition based
on observational determinism for a concurrent language based on the join calculus [9].

Observational determinism implies noninterference under refinement and thus
immunity to refinement attacks. In settings where the resolution of nondeterministic choice
may  depend  on  confidential  information,  we  conjecture  that  observational  determinism 
and  noninterference  under  refinement  are  equivalent.  However,  when  the  resolution  of 
some  choices  is  independent  of  confidential  information,  observational  determinism  is 
a stronger condition: any program that is observationally deterministic satisfies
noninterference under refinement, but the converse does not hold.


8  Conclusion 


This  paper  examines  information  flow  in  a  simple  imperative  language  that  includes 
primitives  for  communication  with  program  users.  In  this  setting,  it  is  not  the  initial 
values  of  variables  or  the  inputs  from  high  users  that  must  be  kept  secret,  but  rather 
the  high  users’  strategies.  We  present  a  trace-based  noninterference  condition  which 
ensures  that  low  users  do  not  learn  anything  about  the  strategies  employed  by  high 
users.  Incorporating  nondeterministic  and  probabilistic  choice  in  the  language  leads  to 
corresponding  security  conditions:  noninterference  under  refinement  and  probabilistic 
noninterference.  We  prove  that  a  type  system  conservatively  enforces  these  security 
conditions. 

This  work  is  a  step  toward  understanding  and  enforcing  information-flow  security 
in  real-world  programs.  Many  programs  interact  with  users,  and  the  behavior  of  these 
users  will  often  be  dependent  on  previous  inputs  and  outputs.  Also,  many  programs, 
especially servers, are intended to run indefinitely rather than to perform some
computation and then halt. Our model of interactivity is thus more suitable for analyzing
real-world systems than the batch-job model. In addition, our imperative language
approximates the implementation of real-world interactive programs more closely than
abstract system models such as the π-calculus. This paper thereby contributes to
understanding the security properties of programs written in languages with information flow
control, such as Jif [26] or Flow Caml [33], that support user input and output.


A  Proof Sketch for Theorem 2

For the proof of Theorem 2, we treat the proof for nonprobabilistic noninterference
(that is, the proof of parts (a) and (b)) separately from the proof of probabilistic
noninterference. This is technically unnecessary, because the necessary lemmas for the
probabilistic proof are generalizations of the lemmas for the nonprobabilistic proof. We take
this approach so that we can prove the nonprobabilistic lemmas, which are simpler to
understand, without the overhead of probability trees and probability distributions. The
nonprobabilistic lemmas are proven in Section A.1, whereas the probabilistic lemmas
and Theorem 2 are proven in Section A.2.

All of the results in this section assume the existence of a single variable typing $\Gamma$.
When convenient, we avoid specifying $\Gamma$ and assume that the typing is given.

We heavily overload the symbol $\sim_L$ to represent low equivalence relations. We
write $\sigma \sim_L \sigma'$ to denote that states $\sigma$ and $\sigma'$ are low-equivalent with respect to $\Gamma$, that
is, if $\sigma(x) = \sigma'(x)$ whenever $\Gamma(x) = L$. Refiners $\psi, \psi' \in Ref$ are low-equivalent,
written $\psi \sim_L \psi'$, if $\psi(L) = \psi'(L)$. Similarly, joint strategies $\omega, \omega' \in Strat$ are
low-equivalent, written $\omega \sim_L \omega'$, if $\omega(L) = \omega'(L)$. Traces $t$ and $t'$ are low-equivalent,
written $t \sim_L t'$, if $t \upharpoonright L = t' \upharpoonright L$.
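
The state and trace relations can be sketched as simple predicates. The representation is assumed for illustration only: states and variable typings are dictionaries, security types are the strings "L" and "H", events are (kind, channel, value) triples, and the function names are hypothetical.

```python
def low_restrict(trace):
    """Keep only the events on channel L."""
    return [e for e in trace if e[1] == "L"]

def traces_low_equiv(t1, t2):
    """Traces are low-equivalent when their restrictions to L coincide."""
    return low_restrict(t1) == low_restrict(t2)

def states_low_equiv(gamma, s1, s2):
    """States are low-equivalent when they agree on every variable typed L."""
    return all(s1[x] == s2[x] for x, tau in gamma.items() if tau == "L")

gamma = {"x": "H", "y": "L"}
s1, s2 = {"x": 1, "y": 7}, {"x": 2, "y": 7}
t1 = [("in", "H", 1), ("out", "L", 7)]
t2 = [("in", "H", 2), ("in", "H", 5), ("out", "L", 7)]
```

Here the two states differ only in the high variable and the two traces differ only in high events, so both pairs are low-equivalent.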

The low-equivalence relation on well-typed commands, denoted $c \sim_L c'$, is defined
by the following rules:

(a) $c \sim_L c$ for all commands $c$;

(b) if $\Gamma \vdash c_1 : H\ cmd$ and $\Gamma \vdash c_2 : H\ cmd$, then $c_1 \sim_L c_2$;

(c) if $\Gamma \vdash c_1 : H\ cmd$ and $\Gamma \vdash c_2 : H\ cmd$, then $c_1; c \sim_L c_2; c$ for all commands $c$;
and

(d) if $\Gamma \vdash c_H : H\ cmd$, then $c_H; c \sim_L c$ and $c \sim_L c_H; c$ for all commands $c$.

Property 1. The relation $\sim_L$ is an equivalence relation on well-typed commands.

Proof.  Reflexivity  is  immediate  by  rule  (a),  and  symmetry  follows  because  the  rules 
themselves  are  symmetric.  Transitivity  follows  by  a  straightforward  analysis  of  each 
pair  of  rules.  □ 

Two configurations $m = (c, \sigma, \psi, t, \omega)$ and $m' = (c', \sigma', \psi', t', \omega')$ are low-equivalent,
written $m \sim_L m'$, if $c \sim_L c'$, $\sigma \sim_L \sigma'$, $\psi \sim_L \psi'$, $t \sim_L t'$, and $\omega \sim_L \omega'$.

A.1  Nonprobabilistic Proof Details

The following lemma, an analogue of the “Simple Security” lemma of [37],
demonstrates that low-typed expressions have the same values in low-equivalent states.

Lemma 1. If $\Gamma \vdash e : L$, then $\Gamma(x) = L$ for every variable $x$ appearing in $e$. In
particular, if $\Gamma \vdash e : L$ and $\sigma \sim_L \sigma'$, then $\sigma(e) = \sigma'(e)$.

Proof.  By  induction  on  the  structure  of  e.  □ 

The  following  lemma  demonstrates  that  configurations  with  high-typed  commands 
take  steps  that  preserve  low-equivalence  (in  the  sense  that  no  low  events  are  emitted  and 
the  resulting  configuration  is  low-equivalent  to  the  initial  configuration). 

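Lemma 2. If $\Gamma \vdash c : H\ cmd$, then for all $\sigma$, $\psi$, $t$, and $\omega$, if

$(c, \sigma, \psi, t, \omega) \rightarrow (c', \sigma', \psi', t', \omega')$,

then $(c, \sigma, \psi, t, \omega) \sim_L (c', \sigma', \psi', t', \omega')$, and moreover $\Gamma \vdash c' : H\ cmd$.

Proof. By induction on the derivation of $(c, \sigma, \psi, t, \omega) \rightarrow (c', \sigma', \psi', t', \omega')$. □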

The following lemma demonstrates that if the first command in a sequence terminates
with some configuration, then the sequence eventually steps to an identical
configuration with skip replaced by the second command in the sequence.

Lemma 3. For all $c_0$, $c_1$, $\sigma$, $\sigma'$, $\psi$, $\psi'$, $t$, $t'$, $\omega$, and $\omega'$, if

$$(c_0, \sigma, \psi, t, \omega) \longrightarrow^* (\mathsf{skip}, \sigma', \psi', t', \omega'),$$

then

$$(c_0; c_1, \sigma, \psi, t, \omega) \longrightarrow^* (c_1, \sigma', \psi', t', \omega').$$

Proof. By induction on the length of the derivation of

$$(c_0, \sigma, \psi, t, \omega) \longrightarrow^* (\mathsf{skip}, \sigma', \psi', t', \omega'),$$

using rule Seq-1 for the base case and rule Seq-2 for the inductive case. □
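
The statement of Lemma 3 can be observed on a toy small-step interpreter. The sketch below is a simplification under stated assumptions: configurations carry only a command and a state (the components $\psi$, $t$, and $\omega$ are omitted), commands are hypothetical tuples, and Seq-1 and Seq-2 are taken to be the usual rules "skip; c steps to c" and "step inside the first command", as the proof above suggests.

```python
SKIP = ("skip",)


def step(cmd, sigma):
    """Perform one small step and return the next (command, state) pair."""
    kind = cmd[0]
    if kind == "assign":                      # ("assign", x, n) models x := n
        _, x, n = cmd
        return SKIP, {**sigma, x: n}
    if kind == "seq":
        _, c0, c1 = cmd
        if c0 == SKIP:                        # Seq-1 (assumed): skip; c1 steps to c1
            return c1, sigma
        c0_next, sigma_next = step(c0, sigma)  # Seq-2 (assumed): step inside c0
        return ("seq", c0_next, c1), sigma_next
    raise ValueError(f"no step for {cmd!r}")


def run_until(cmd, sigma, stop):
    """Step until stop(cmd) holds; return the final command, state, and step count."""
    steps = 0
    while not stop(cmd):
        cmd, sigma = step(cmd, sigma)
        steps += 1
    return cmd, sigma, steps


c0 = ("seq", ("assign", "x", 1), ("assign", "y", 2))
c1 = ("assign", "z", 3)

# c0 alone runs to skip ...
_, sigma_alone, n_alone = run_until(c0, {}, lambda c: c == SKIP)
# ... and c0; c1 reaches c1 with the same state, using one extra Seq-1 step.
reached, sigma_seq, n_seq = run_until(("seq", c0, c1), {}, lambda c: c == c1)

print(sigma_alone, n_alone)
print(reached, sigma_seq, n_seq)
```

Every step of `c0` is replayed under Seq-2 inside the sequence, and the final Seq-1 step discards the leading skip, which is the shape of the induction in the proof.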


The following lemma demonstrates that high-typed commands always terminate,
and that the resulting terminal configuration is low-equivalent to the initial configuration.

Lemma 4. If $\Gamma \vdash c : H\ \mathit{cmd}$, then for any $\sigma$, $\psi$, $t$, and $\omega$ there exist $\sigma'$, $\psi'$, $t'$, and $\omega'$
such that

$$(c, \sigma, \psi, t, \omega) \longrightarrow^* (\mathsf{skip}, \sigma', \psi', t', \omega'),$$

and

$$(c, \sigma, \psi, t, \omega) \sim_L (\mathsf{skip}, \sigma', \psi', t', \omega').$$


Proof.  Note  that  a  high-typed  command  cannot  contain  a  while-statement.  The  result 
follows  by  structural  induction  on  c,  using  Lemma  2  to  demonstrate  low  equivalence 
for  the  base  cases.  For  sequences  we  appeal  to  Lemma  3.  □ 
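
The termination half of this argument can be pictured as a static step bound computed by structural recursion, which exists precisely because no while-statement occurs. The Python sketch below is only an illustration: the tuple encoding is hypothetical, choice operators are omitted, and it assumes that atomic commands reach skip in one step and that a conditional takes one step to select a branch.

```python
def max_steps(cmd):
    """Upper bound on the number of small steps a while-free command needs to reach skip."""
    kind = cmd[0]
    if kind == "skip":
        return 0
    if kind in ("assign", "in", "out"):       # atomic commands: one step (assumed)
        return 1
    if kind == "seq":                         # steps of c0, one Seq-1 step, steps of c1
        _, c0, c1 = cmd
        return max_steps(c0) + 1 + max_steps(c1)
    if kind == "if":                          # one step to pick a branch, then that branch
        _, _guard, c_then, c_else = cmd
        return 1 + max(max_steps(c_then), max_steps(c_else))
    if kind == "while":
        raise ValueError("while-loops admit no static step bound")
    raise ValueError(f"unknown command {cmd!r}")


high_cmd = ("seq", ("assign", "h", 1),
            ("if", "h", ("assign", "h", 2), ("skip",)))
print(max_steps(high_cmd))
```

The recursion follows the same cases as the structural induction: atomic commands are the base cases, and the sequence case corresponds to the appeal to Lemma 3.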

The following lemma demonstrates that low-equivalent configurations with the same
command take steps that preserve low equivalence.

Lemma 5. For all $c$, $\sigma_1$, $\sigma_2$, $\psi_1$, $\psi_2$, $t_1$, $t_2$, $\omega_1$, $\omega_2$, and $m_1$, if

$$(c, \sigma_1, \psi_1, t_1, \omega_1) \sim_L (c, \sigma_2, \psi_2, t_2, \omega_2) \quad\text{and}\quad (c, \sigma_1, \psi_1, t_1, \omega_1) \longrightarrow m_1,$$

then there exists a configuration $m_2$ such that

$$(c, \sigma_2, \psi_2, t_2, \omega_2) \longrightarrow m_2 \quad\text{and}\quad m_1 \sim_L m_2.$$

Proof. By induction on the derivation of $(c, \sigma_1, \psi_1, t_1, \omega_1) \longrightarrow m_1$, using Lemma 1 for
the rules Assign, Out, If-1, and If-2. □

The  main  nonprobabilistic  lemma  demonstrates  that  the  traces  emitted  by  low- 
equivalent  configurations  are  low-equivalent. 

Lemma 6. For all configurations $m_1$, $m_2$, and $m_1'$, if

$$m_1 \sim_L m_2 \quad\text{and}\quad m_1 \longrightarrow^* m_1',$$

then there exists a configuration $m_2'$ such that

$$m_2 \longrightarrow^* m_2' \quad\text{and}\quad m_1' \sim_L m_2'.$$

Proof. By induction on the length of the derivation of $m_1 \longrightarrow^* m_1'$. The base case is
trivial. Otherwise, write $m_1 = (c_1, \sigma_1, \psi_1, t_1, \omega_1)$ and $m_2 = (c_2, \sigma_2, \psi_2, t_2, \omega_2)$, and
consider the cases for $c_1 \sim_L c_2$:

(a) If $c_1 = c_2$, then suppose that $m_1 \longrightarrow m''$ and $m'' \longrightarrow^* m_1'$. By Lemma 5, there is
a configuration $m_2''$ such that $m_2 \longrightarrow m_2''$ and $m'' \sim_L m_2''$. We can then apply the inductive
hypothesis.

(b) If $c_1$ and $c_2$ are both high-typed, suppose that $m_1 \longrightarrow m''$ and $m'' \longrightarrow^* m_1'$. By
Lemma 2, $m''$ is low-equivalent to $m_2$, and we can apply the inductive hypothesis.

(c) If $c_1 = c_{H_1}; c$ and $c_2 = c_{H_2}; c$ for some command $c$ and high-typed commands
$c_{H_1}$ and $c_{H_2}$, then consider the form of $c_{H_1}$. If $c_{H_1} = \mathsf{skip}$, then $m_1 \longrightarrow m$,
where $m = (c, \sigma_1, \psi_1, t_1, \omega_1)$, and since $(c, \sigma_1, \psi_1, t_1, \omega_1)$ is low-equivalent to
$m_2$, we can apply the inductive hypothesis. Otherwise, by Lemma 2 and Seq-2,
$m_1 \longrightarrow m''$ for some $m''$ that is low-equivalent to $m_2$, and we can apply the
inductive hypothesis.

(d) If $c_1 = c_{H_1}; c_2$, then consider the form of $c_{H_1}$. If $c_{H_1} = \mathsf{skip}$, then $m_1 \longrightarrow
(c_2, \sigma_1, \psi_1, t_1, \omega_1)$, and since $(c_2, \sigma_1, \psi_1, t_1, \omega_1)$ is low-equivalent to $m_2$, we can
apply the inductive hypothesis. Otherwise, by Lemma 2 and Seq-2, $m_1 \longrightarrow m''$
for some $m''$ that is low-equivalent to $m_2$, and we can apply the inductive hypothesis.
If instead $c_2 = c_{H_2}; c_1$, then by Lemma 4 and Lemma 3, there is a configuration $m_2' =
(c_1, \sigma_2', \psi_2', t_2', \omega_2')$ such that $m_2 \longrightarrow^* m_2'$ and $m_2' \sim_L m_2$, and thus $m_1 \sim_L m_2'$.
Suppose $m_1 \longrightarrow m''$ and $m'' \longrightarrow^* m_1'$. Then by Lemma 5, there is a configuration
$m_2'''$ such that $m_2' \longrightarrow m_2'''$ and $m'' \sim_L m_2'''$, and we can apply the inductive
hypothesis.

□


The first two cases of Theorem 2 follow directly from this result.

A.2 Probabilistic Proof Details

We now generalize the results of the previous section to account for probabilistic programs.
The structure of the proof is similar to that of the nonprobabilistic results.

Given a vertex $v$ of a probability tree $T_m$, let $T_v$ denote the subtree of $T_m$ rooted
at $v$, and let $V_v$ and $\mathcal{R}_v$ denote the sets of vertices and rays of $T_v$, respectively. We
denote the configuration with which $v$ is labeled as $\mathit{cf}(v)$, and we write $v \sim_L v'$ if
$\mathit{cf}(v) \sim_L \mathit{cf}(v')$.

Let a frontier set of a vertex $v$ be a finite set of vertices $S \subseteq V_v$ such that for every
ray $r \in \mathcal{R}_v$ there exists exactly one vertex from $S$ on $r$. Given a frontier set $S$ of $v$, we
call $F = (v, S)$ a frontier. Note that $\{v\}$ is a frontier set of $v$, and that given any frontier
$F$ we can obtain a new frontier $F'$ by replacing any vertex in the frontier set with all
of its children. Note also that a frontier set $S$ partitions $\mathcal{R}_v$ into sets of rays that go
through particular vertices in $S$.

Define  the  depth  of  a  frontier  (v,  S)  to  be  the  length  of  the  longest  path  (that  is,  the 
number  of  edges  in  the  longest  path)  between  v  and  vertices  in  S. 

Because the vertices in a frontier $F = (v, S)$ induce a partition on the sets of rays
going through $v$, the function $\pi$ on vertices gives rise to a discrete probability measure
on sets of vertices in $S$, normalized by the value of $\pi(v)$. More concretely, for any
vertex $v'$ in a frontier set $S$ of $v$, let $\pi_v(v')$ be the product of the probabilities on the
path from $v$ to $v'$. We can now compare the distribution of low-equivalent configurations
in two different frontiers. Given a frontier $F = (v, S)$ and a configuration $m$, let

$$[m]_F = \{ v' \in S \mid \mathit{cf}(v') \sim_L m \}$$

be the subset of $S$ whose configurations are low-equivalent to $m$. Define two frontiers
$F = (v, S)$ and $F' = (v', S')$ to be low-equivalent, denoted $F \sim_L F'$, if for any
configuration $m$ we have

$$\sum_{v'' \in [m]_F} \pi_v(v'') = \sum_{v'' \in [m]_{F'}} \pi_{v'}(v'').$$
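
To make the definition of frontier low-equivalence concrete, the following Python sketch computes $\pi_v$ and the class masses $\sum_{v'' \in [m]_F} \pi_v(v'')$ on small hand-built trees. It rests on simplifying assumptions: each vertex carries a string `label` that stands in for the low-equivalence class of its configuration, so equality of labels plays the role of $\sim_L$, and the helper names are hypothetical.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import List, Tuple


@dataclass(eq=False)  # identity-based equality: distinct vertices may share a label
class Vertex:
    label: str        # stands in for the ~_L class of cf(v)
    children: List[Tuple[Fraction, "Vertex"]] = field(default_factory=list)


def pi(v, target, prob=Fraction(1)):
    """pi_v(target): product of edge probabilities from v to target (0 if target is not below v)."""
    if v is target:
        return prob
    return sum((pi(child, target, prob * p) for p, child in v.children), Fraction(0))


def class_mass(v, frontier_set, label):
    """Sum of pi_v(u) over the frontier vertices u whose class is `label`."""
    return sum((pi(v, u) for u in frontier_set if u.label == label), Fraction(0))


def frontiers_low_equiv(f1, f2):
    """Frontiers (v, S) are low-equivalent if every class has the same mass in both."""
    (v1, s1), (v2, s2) = f1, f2
    labels = {u.label for u in s1} | {u.label for u in s2}
    return all(class_mass(v1, s1, l) == class_mass(v2, s2, l) for l in labels)


# Tree 1 reaches class A or B with probability 1/2 each.
a, b = Vertex("A"), Vertex("B")
r1 = Vertex("R", [(Fraction(1, 2), a), (Fraction(1, 2), b)])
# Tree 2 splits the A mass across two vertices; the class masses are unchanged.
a1, a2, b2 = Vertex("A"), Vertex("A"), Vertex("B")
r2 = Vertex("R", [(Fraction(1, 4), a1), (Fraction(1, 4), a2), (Fraction(1, 2), b2)])
# Tree 3 distributes the mass differently.
a3, b3 = Vertex("A"), Vertex("B")
r3 = Vertex("R", [(Fraction(1, 3), a3), (Fraction(2, 3), b3)])

print(frontiers_low_equiv((r1, [a, b]), (r2, [a1, a2, b2])))
print(frontiers_low_equiv((r1, [a, b]), (r3, [a3, b3])))
```

Note that low-equivalent frontiers need not have the same number of vertices; only the total probability assigned to each low-equivalence class must agree.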

The  following  lemma,  which  generalizes  Lemma  3,  demonstrates  that  if  the  first 
command  in  a  sequence  terminates  in  all  execution  paths,  then  the  sequence  eventually 
steps,  in  all  execution  paths,  to  the  second  command,  while  preserving  other  aspects  of 
the  original  terminal  configuration. 

Lemma 7. If $v$ is a vertex in a probability tree such that $\mathit{cf}(v) = (c_0; c_1, \sigma, \psi, t, \omega)$,
and $F_0 = (v_0, S_0)$ is a frontier such that

- $\mathit{cf}(v_0) = (c_0, \sigma, \psi, t, \omega)$,

- the subtree rooted at $v_0$ is finite, and

- $S_0$ consists of the leaf vertices of the subtree rooted at $v_0$,

then there exists a frontier $F = (v, S)$ and a one-to-one mapping $g : S_0 \to S$ such that
if $v_0' \in S_0$ and $\mathit{cf}(v_0') = (\mathsf{skip}, \sigma', \psi', t', \omega')$, we have $\mathit{cf}(g(v_0')) = (c_1, \sigma', \psi', t', \omega')$.

Proof. By induction on the depth of $F_0$, using rules Seq-1 and Seq-2. □

The following lemma, which generalizes Lemma 4, demonstrates that high-typed
commands terminate in all execution paths and that terminal configurations are low-
equivalent to the initial configuration.

Lemma 8. If $v$ is a vertex in a probability tree such that $\mathit{cf}(v) = (c, \sigma, \psi, t, \omega)$ and
$\Gamma \vdash c : H\ \mathit{cmd}$, then the subtree rooted at $v$ is finite, and for any leaf vertex $v'$ of
that subtree we have $v' \sim_L v$.

Proof.  By  structural  induction  on  c,  using  Lemma  7  for  sequences  and  Lemma  2  to 
demonstrate  low-equivalence  for  the  base  cases.  □ 

The following lemma, generalizing Lemma 5, demonstrates that low-equivalent vertices
have low-equivalent sets of children.

Lemma 9. If $F_1 = (v_1, S_1)$ and $F_2 = (v_2, S_2)$ are frontiers such that

- $v_1 \sim_L v_2$,

- $S_1$ and $S_2$ are the sets of children of $v_1$ and $v_2$, and

- $\mathit{cf}(v_1)$ and $\mathit{cf}(v_2)$ share the same command $c$,

then $F_1 \sim_L F_2$.

Proof.  By  structural  induction  on  c,  using  Lemma  1  for  the  rules  Assign,  Out,  If-1, 
and  If-2.  □ 

The  following  lemma  is  useful  for  the  inductive  cases  of  Lemma  11.  It  states  that 
we  can  combine  frontiers  of  vertices  in  low-equivalent  frontiers  to  obtain  deeper  low- 
equivalent  frontiers. 

Lemma 10. If $F_1 = (v_1, S_1)$ and $F_2 = (v_2, S_2)$ are frontiers such that

- $F_1 \sim_L F_2$;

- $g_1$ is a mapping from $S_1$ to frontier sets such that for any $v \in S_1$, $(v, g_1(v))$ is a
frontier;

- $g_2$ is a mapping from $S_2$ to frontier sets such that for any $v \in S_2$, $(v, g_2(v))$ is a
frontier; and

- for all $v_1' \in S_1$ and $v_2' \in S_2$ such that $v_1' \sim_L v_2'$, we have
$(v_1', g_1(v_1')) \sim_L (v_2', g_2(v_2'))$;

then $F_1' = (v_1, \bigcup_{v \in S_1} g_1(v))$ and $F_2' = (v_2, \bigcup_{v \in S_2} g_2(v))$ are frontiers such that
$F_1' \sim_L F_2'$.

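Proof. That $F_1'$ and $F_2'$ are frontiers follows directly from the definition of a frontier
set. To demonstrate the low-equivalence of $F_1'$ and $F_2'$, we must establish that for all
configurations $m$ we have

$$\sum_{v \in [m]_{F_1'}} \pi_{v_1}(v) = \sum_{v \in [m]_{F_2'}} \pi_{v_2}(v).$$

Let $\mathcal{M}$ denote a set of class representatives (for the equivalence relation $\sim_L$) of the
set of configurations with which the elements of $S_1 \cup S_2$ are labeled. We have

$$\begin{aligned}
\sum_{v \in [m]_{F_1'}} \pi_{v_1}(v)
&= \sum_{v_1' \in S_1} \sum_{v \in [m]_{g_1(v_1')}} \pi_{v_1}(v_1') \cdot \pi_{v_1'}(v) \\
&= \sum_{m' \in \mathcal{M}} \sum_{v_1' \in [m']_{F_1}} \sum_{v \in [m]_{g_1(v_1')}} \pi_{v_1}(v_1') \cdot \pi_{v_1'}(v) \\
&= \sum_{m' \in \mathcal{M}} \sum_{v_1' \in [m']_{F_1}} \pi_{v_1}(v_1') \cdot \sum_{v \in [m]_{g_1(v_1')}} \pi_{v_1'}(v).
\end{aligned}$$

However, by assumption, for any configuration $m$ and for any $v_1' \in S_1$ and $v_2' \in S_2$
such that $v_1' \sim_L v_2'$, we have $(v_1', g_1(v_1')) \sim_L (v_2', g_2(v_2'))$, and thus

$$\sum_{v \in [m]_{g_1(v_1')}} \pi_{v_1'}(v) = \sum_{v \in [m]_{g_2(v_2')}} \pi_{v_2'}(v).$$

In fact, for any $w \in S_1$ such that $v_1' \sim_L w$, we also have

$$\sum_{v \in [m]_{g_1(v_1')}} \pi_{v_1'}(v) = \sum_{v \in [m]_{g_1(w)}} \pi_{w}(v).$$

Thus, this sum depends only on the low-equivalence class of $v_1'$. We capture this by
defining $s(m', m)$ to be equal to this sum, resulting in the following equalities:

$$\begin{aligned}
s(m', m) &= \sum_{v \in [m]_{g_1(v_1')}} \pi_{v_1'}(v) && \text{for any } v_1' \in S_1 \text{ such that } \mathit{cf}(v_1') \sim_L m' \\
&= \sum_{v \in [m]_{g_2(v_2')}} \pi_{v_2'}(v) && \text{for any } v_2' \in S_2 \text{ such that } \mathit{cf}(v_2') \sim_L m'.
\end{aligned}$$

We therefore have

$$\begin{aligned}
\sum_{v \in [m]_{F_1'}} \pi_{v_1}(v)
&= \sum_{m' \in \mathcal{M}} \sum_{v_1' \in [m']_{F_1}} \pi_{v_1}(v_1') \cdot s(m', m) \\
&= \sum_{m' \in \mathcal{M}} s(m', m) \cdot \sum_{v_1' \in [m']_{F_1}} \pi_{v_1}(v_1') \\
&= \sum_{m' \in \mathcal{M}} s(m', m) \cdot \sum_{v_2' \in [m']_{F_2}} \pi_{v_2}(v_2') && [\text{as } F_1 \sim_L F_2] \\
&= \sum_{v \in [m]_{F_2'}} \pi_{v_2}(v), && [\text{similarly}]
\end{aligned}$$

as desired. □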
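
The calculation in this proof can be replayed numerically. In the Python sketch below, which is an illustration under simplifying assumptions rather than part of the proof, a frontier is flattened to a list of `(label, probability)` pairs (one per frontier vertex, with the label standing in for the low-equivalence class of the vertex's configuration), and the mappings $g_1$ and $g_2$ become lists of sub-frontiers in the same order.

```python
from collections import defaultdict
from fractions import Fraction as Fr


def class_masses(frontier):
    """Total probability per class for a frontier given as (label, probability) pairs."""
    masses = defaultdict(Fr)
    for label, p in frontier:
        masses[label] += p
    return dict(masses)


def combine(frontier, sub_frontiers):
    """Replace the i-th frontier vertex by the vertices of sub_frontiers[i],
    using pi_{v1}(v) = pi_{v1}(v1') * pi_{v1'}(v)."""
    return [(label, p * q)
            for (_, p), sub in zip(frontier, sub_frontiers)
            for label, q in sub]


# Two low-equivalent frontiers: class A has mass 1/2 and class B has mass 1/2 in both.
F1 = [("A", Fr(1, 2)), ("B", Fr(1, 2))]
F2 = [("A", Fr(1, 4)), ("A", Fr(1, 4)), ("B", Fr(1, 2))]

# Sub-frontiers of low-equivalent vertices are themselves low-equivalent
# (class C has mass 1/3 and class D has mass 2/3 below every A vertex).
sub_a_1 = [("C", Fr(1, 3)), ("D", Fr(2, 3))]
sub_a_2 = [("C", Fr(1, 3)), ("D", Fr(1, 3)), ("D", Fr(1, 3))]
trivial_b = [("B", Fr(1))]              # the trivial frontier set {v}

g1 = [sub_a_1, trivial_b]
g2 = [sub_a_2, sub_a_2, trivial_b]

print(class_masses(combine(F1, g1)))
print(class_masses(combine(F2, g2)))
```

Both combined frontiers assign the same mass to every class, as the lemma predicts; the quantity called $s(m', m)$ in the proof corresponds to the mass of class $m$ within the sub-frontier of any vertex of class $m'$.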


We  can  now  state  the  main  lemma,  which  generalizes  Lemma  6. 

Lemma 11. If $F_1 = (v_1, S_1)$ is a frontier and $v_2 \sim_L v_1$, then there exists a frontier set
$S_2$ of $v_2$ such that $F_1 \sim_L (v_2, S_2)$.

Proof. By induction on the depth of $F_1$. The base case is trivial, and cases (a)-(d)
are analogues of the cases in the nonprobabilistic proof. As before, write $\mathit{cf}(v_1) =
(c_1, \sigma_1, \psi_1, t_1, \omega_1)$ and $\mathit{cf}(v_2) = (c_2, \sigma_2, \psi_2, t_2, \omega_2)$, and consider the cases for
$c_1 \sim_L c_2$:

(a) Suppose that $c_1 = c_2$. If $S_1'$ and $S_2'$ are the sets of children of $v_1$ and $v_2$, we have
$(v_1, S_1') \sim_L (v_2, S_2')$ by Lemma 9. The inductive hypothesis applies to $v_1' \in S_1'$
(with appropriate frontier sets $\subseteq S_1$) and elements $v_2' \in S_2'$ such that $v_1' \sim_L v_2'$,
and the result follows by Lemma 10.

(b) If $c_1$ and $c_2$ are both high-typed, the result follows by Lemma 2, the inductive
hypothesis applied to the children of $v_1$, and Lemma 10.

(c) If $c_1 = c_{H_1}; c$ and $c_2 = c_{H_2}; c$ for some command $c$ and high-typed commands
$c_{H_1}$ and $c_{H_2}$, let $S_1'$ be the set of children of $v_1$. For each $v_1' \in S_1'$ we have
$v_1' \sim_L v_1 \sim_L v_2$ (by Seq-1 if $c_{H_1}$ is skip, or by Lemma 2 and Seq-2 otherwise), and we
can therefore apply the inductive hypothesis. The result follows from Lemma 10.

(d) If $c_1 = c_{H_1}; c_2$, we can apply the inductive hypothesis using the same reasoning
used for case (c). Otherwise we have $c_2 = c_{H_1}; c_1$. Let $S_1'$ be the set of children
of $v_1$. By Lemma 7 and Lemma 8, there exists a frontier set $S_2'$ of $v_2$ such that for
every element $v_2' \in S_2'$, $v_2'$ is labeled with a configuration whose command is $c_1$,
and $v_2' \sim_L v_2 \sim_L v_1$. Given any $v_2' \in S_2'$, let $S_{v_2'}$ be the set of children of $v_2'$. By
Lemma 9 we have $(v_1, S_1') \sim_L (v_2', S_{v_2'})$, and the inductive hypothesis applies to
elements $v_1' \in S_1'$ as in case (a). By Lemma 10 we can combine the frontiers of
the elements of $S_{v_2'}$ to get a frontier $F_{v_2'} = (v_2', S'_{v_2'})$ such that $F_{v_2'} \sim_L F_1$. Let
$S_2 = \bigcup_{v_2' \in S_2'} S'_{v_2'}$. We have $(v_2, S_2) \sim_L F_1$ by Lemma 10.

□

We are now ready to prove Theorem 2. We do so by establishing a connection
between $\mu_m(\mathcal{E}_m(t))$, the probability that configuration $m$ emits trace $t$, and the probabilities
of vertices of arbitrarily deep frontiers of $T_m$. Given a trace $t$ and probability
tree $T_m$ with root vertex $v_r$ and frontier $(v_r, S)$, define:

$$\mathcal{E}_S(t) = \{ r \in \mathcal{R}_m \mid \text{there exists a vertex } v \in S \text{ on } r \text{ and a trace } t' \text{ such that } \mathit{tr}(v) \text{ extends } t' \text{ and } t' \upharpoonright L = t \upharpoonright L \}.$$

The set $\mathcal{E}_S(t)$ consists of those rays on which traces that are low-equivalent to $t$ appear at
vertices that are ancestors of elements in $S$. Intuitively, $\mu_m(\mathcal{E}_S(t))$ is an approximation
of $\mu_m(\mathcal{E}_m(t))$.

Theorem 2 (Soundness). For any command $c$, if there exists a variable typing $\Gamma$ and a
security type $\tau$ such that $\Gamma \vdash c : \tau\ \mathit{cmd}$, then

(a) if $c$ does not contain nondeterministic or probabilistic choice, then $c$ satisfies
noninterference;

(b) if $c$ does not contain probabilistic choice, then $c$ satisfies noninterference under
refinement; and

(c) $c$ satisfies probabilistic noninterference.

Proof. That $c$ satisfies noninterference and noninterference under refinement follows
from Lemma 6. To demonstrate that $c$ satisfies probabilistic noninterference, we must
show that, for all low-equivalent configurations $m$ and $m'$ and traces $t$, we have
$\mu_m(\mathcal{E}_m(t)) = \mu_{m'}(\mathcal{E}_{m'}(t))$. We demonstrate that $\mu_m(\mathcal{E}_m(t)) \le \mu_{m'}(\mathcal{E}_{m'}(t))$; the reverse
inequality is symmetric. Suppose, by way of contradiction, that $\mu_m(\mathcal{E}_m(t)) > \mu_{m'}(\mathcal{E}_{m'}(t))$. We
demonstrate below that there exists a sequence of frontier sets $\{S_i\}$ of the root vertex $v_r$
of $T_m$ such that $\mu_m(\mathcal{E}_{S_i}(t))$ converges to $\mu_m(\mathcal{E}_m(t))$. It follows that there exists a frontier
set $S$ of $T_m$ such that $\mu_m(\mathcal{E}_S(t)) > \mu_{m'}(\mathcal{E}_{m'}(t))$. But by Lemma 11, there exists a frontier
$F' = (v_r', S')$ of $T_{m'}$ such that $F' \sim_L (v_r, S)$. Thus $\mu_{m'}(\mathcal{E}_{S'}(t)) = \mu_m(\mathcal{E}_S(t))$,
and $\mu_{m'}(\mathcal{E}_{S'}(t)) > \mu_{m'}(\mathcal{E}_{m'}(t))$. This is a contradiction, because $\mathcal{E}_{S'}(t) \subseteq \mathcal{E}_{m'}(t)$.

We now exhibit a sequence of frontier sets $\{S_i\}$ such that $\mu_m(\mathcal{E}_{S_i}(t))$ converges to
$\mu_m(\mathcal{E}_m(t))$. Consider the sequence $S_0, S_1, \ldots, S_i, \ldots$ of frontier sets of $v_r$ that comprise
all the vertices at depth $i$ of $T_m$. We have $\mathcal{E}_m(t) = \bigcup_{i \ge 0} \mathcal{E}_{S_i}(t)$, and for all $i$ we
have $\mathcal{E}_{S_i}(t) \subseteq \mathcal{E}_{S_{i+1}}(t)$. Convergence follows due to a standard result in probability
theory [2]. □
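
The convergence step is an instance of continuity of a probability measure along an increasing sequence of events. The Python sketch below illustrates it on a hypothetical program, not taken from the paper, that repeatedly flips a fair coin and emits a low event the first time the coin comes up heads; `mass_by_depth(i)` plays the role of $\mu_m(\mathcal{E}_{S_i}(t))$ under the assumption that each iteration adds one level to the probability tree.

```python
from fractions import Fraction


def mass_by_depth(i, p_stop=Fraction(1, 2)):
    """Probability that the event has been emitted within the first i iterations,
    for a loop that emits with probability p_stop on each iteration."""
    return sum((p_stop * (1 - p_stop) ** k for k in range(i)), Fraction(0))


approximations = [mass_by_depth(i) for i in range(6)]
print(approximations)

# The approximations increase monotonically, and the gap to the limit 1 halves at each depth.
print(1 - mass_by_depth(20))
```

Each deeper frontier can only add rays on which the trace has appeared, so the approximations increase toward the probability of the limiting event, which is what the proof by contradiction relies on.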


References 

1. Ana Almeida Matos, Gerard Boudol, and Ilaria Castellani. Typing noninterference for reactive
programs. In Proc. Workshop on Foundations of Computer Security, 2004.

2. Patrick Billingsley. Probability and Measure. Wiley-Interscience, 3rd edition, April 1995.

3. Gerard Boudol and Ilaria Castellani. Noninterference for concurrent programs and thread
systems. Lecture Notes in Computer Science, 281(1):109-130, 2002.

4. Michael R. Clarkson, Andrew C. Myers, and Fred B. Schneider. Belief in information flow.
In Proc. 18th IEEE Computer Security Foundations Workshop, pages 31-45, June 2005.

5. Dorothy E. Denning. A lattice model of secure information flow. Comm. of the ACM,
19(5):236-243, 1976.

6. Riccardo Focardi and Roberto Gorrieri. Classification of security properties (Part I: Information
flow). In Foundations of Security Analysis and Design, pages 331-396. Springer,
2001.

7. Riccardo Focardi and Sabina Rossi. Information flow security in dynamic contexts. In Proc.
15th IEEE Computer Security Foundations Workshop, pages 307-319, 2002.

8. Riccardo Focardi, Sabina Rossi, and Andrei Sabelfeld. Bridging language-based and process
calculi security. In Proc. of Foundations of Software Science and Computation Structures
(FOSSACS’05), volume 3441 of LNCS, April 2005.

9. Cedric Fournet and Georges Gonthier. The Reflexive CHAM and the Join-Calculus. In Conf.
Record 23rd ACM Symposium on Principles of Programming Languages, pages 372-385,
1996.

10. Joseph A. Goguen and Jose Meseguer. Security policies and security models. In Proc. IEEE
Symposium on Security and Privacy, pages 11-20, 1982.

11. James W. Gray III and Paul F. Syverson. A logical approach to multilevel security of probabilistic
systems. Distributed Computing, 11(2):73-90, 1998.

12. Joseph Y. Halpern. Reasoning About Uncertainty. MIT Press, Cambridge, Mass., 2003.

13. Joseph Y. Halpern and Kevin R. O'Neill. Secrecy in multiagent systems. In Proc. 15th IEEE
Computer Security Foundations Workshop, pages 32-46, 2002.

14. Joseph Y. Halpern and Kevin R. O’Neill. Secrecy in multiagent systems. Available at
http://arxiv.org/pdf/cs.CR/0307057, 2005.

15. Joseph Y. Halpern and Mark Tuttle. Knowledge, probability, and adversaries. Journal of the
ACM, 40(4):917-962, 1993.

16. Jifeng He, K. Seidel, and A. McIver. Probabilistic models for the guarded command language.
Science of Computer Programming, 28:171-192, 1997.

17. C.A.R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985.

18. Sebastian Hunt and Dave Sands. On flow-sensitive security types. In Conf. Record 33rd
ACM Symposium on Principles of Programming Languages, 2006.

19. Heiko Mantel. A uniform framework for the formal specification and verification of information
flow security. PhD thesis, Universität des Saarlandes, 2003.

20. Heiko Mantel and Andrei Sabelfeld. A unifying approach to the security of distributed and
multi-threaded programs. Journal of Computer Security, 11(4):615-676, September 2003.


21. Daryl McCullough. Specifications for multi-level security and a hook-up property. In Proc.
IEEE Symposium on Security and Privacy, pages 161-166, 1987.

22. John McLean. Proving noninterference and functional correctness using traces. Journal of
Computer Security, 1(1):37-58, 1992.

23. John McLean. A general theory of composition for trace sets closed under selective interleaving
functions. In Proc. IEEE Symposium on Security and Privacy, pages 79-93, 1994.

24. R. Milner. A Calculus of Communicating Systems. Lecture Notes in Computer Science,
Volume 92. Springer-Verlag, Berlin/New York, 1980.

25. Robin Milner. Processes: A mathematical model of computing agents. In H. E. Rose and
J. C. Shepherdson, editors, Proceedings of the Logic Colloquium, Bristol, July 1973, pages
157-173, New York, 1975. American Elsevier Pub. Co.

26. Andrew C. Myers, Lantian Zheng, Steve Zdancewic, Stephen Chong, and
Nathaniel Nystrom. Jif: Java information flow. Software release. Located at
http://www.cs.cornell.edu/jif, 2001-2005.

27. A. W. Roscoe. CSP and determinism in security modeling. In Proc. IEEE Symposium on
Security and Privacy, 1995.

28. Peter Y. A. Ryan and Steve A. Schneider. Process algebra and non-interference. In Proc.
12th IEEE Computer Security Foundations Workshop, pages 214-227, 1999.

29. Andrei Sabelfeld and Heiko Mantel. Static confidentiality enforcement for distributed programs.
In Proceedings of the 9th International Static Analysis Symposium, volume 2477 of
LNCS, Madrid, Spain, September 2002. Springer-Verlag.

30. Andrei Sabelfeld and Andrew C. Myers. Language-based information-flow security. IEEE
Journal on Selected Areas in Communications, 21(1):5-19, January 2003.

31. Andrei Sabelfeld and David Sands. Probabilistic noninterference for multi-threaded programs.
In Proc. 13th IEEE Computer Security Foundations Workshop, pages 200-214. IEEE
Computer Society Press, July 2000.

32. Vincent Simonet. Fine-grained information flow analysis for a lambda-calculus with sum
types. In Proc. 15th IEEE Computer Security Foundations Workshop, pages 223-237, June
2002.

33. Vincent Simonet. The Flow Caml System: Documentation and user’s manual. Technical
Report 0282, Institut National de Recherche en Informatique et en Automatique (INRIA),
July 2003.

34. Geoffrey Smith. A new type system for secure information flow. In Proc. 14th IEEE Computer
Security Foundations Workshop, pages 115-125, 2001.

35. Moshe Y. Vardi. Automatic verification of probabilistic concurrent finite-state programs. In
Proc. 26th IEEE Symp. on Foundations of Computer Science, pages 327-338, 1985.

36. Dennis Volpano and Geoffrey Smith. Probabilistic noninterference in a concurrent language.
Journal of Computer Security, 7(2,3):231-253, November 1999.

37. Dennis Volpano, Geoffrey Smith, and Cynthia Irvine. A sound type system for secure flow
analysis. Journal of Computer Security, 4(3):167-187, 1996.

38. J. Todd Wittbold and Dale M. Johnson. Information flow in nondeterministic systems. In
Proc. IEEE Symposium on Security and Privacy, pages 144-161, May 1990.

39. Steve Zdancewic and Andrew C. Myers. Observational determinism for concurrent program
security. In Proc. 16th IEEE Computer Security Foundations Workshop, pages 29-43, Pacific
Grove, California, June 2003.




